The Asia-Pacific Region is going through rapid urbanization along 
with thriving industrial development. Sustainable growth is, therefore, a 
common concern for the survival of its cities, as only developmental plans 
embracing the sustainability of the region's industries will ensure a bright future 
balancing economic, social, and ecological development. 

"Industrial sustainability is more important than ever, and cities not 
considering this first will put their future at peril,” says Xue Lan, a distinguished 
professor at Tsinghua University. 

To evaluate the current and future impact of regional industries on 
the ecological environment, the Center for Industrial Development and 
Environmental Governance (CIDEG) of Tsinghua University and the 
Asia-Pacific Economic Cooperation (APEC) China Business Council jointly 
launched the Asia-Pacific Industrial Sustainability Index (AP-ISI). The final 
goal of the AP-ISI is to help economies in the Asia-Pacific region balance 
sustainability and economic development. 


Sustainability in the Asia-Pacific region 

The Asia-Pacific region is vast and includes different levels of development. 
However, there is a broad consensus on the critical elements of sustainable 
development, such as green economic growth, social justice, environmental 
protection, and reduction of greenhouse gas emissions. Based on these general 
agreements, the AP-ISI evaluates industrial sustainability in five dimensions: 
driving forces, pressures, states, impacts, and responses. As economies in the 
same category tend to have similar experiences and challenges, the index divides 
economies into three types: major economy, emerging economy, and island 
economy. For cities from emerging economies, the index also weighs 
the extra challenges of trading off economic growth against environmental 
impacts, along with the implications of political stability. Generally, the AP-ISI reflects a 
comprehensive and in-depth analysis of industrial sustainability and suggests areas 
of concern that need attention in order to support development opportunities for 
different regions and consequently to promote global development and improved 
quality of life in this region and worldwide. 


The Asia-Pacific Industrial Sustainability Index 2022 

The AP-ISI 2022 evaluates the industrial sustainability performance of 35 cities 
(regions) in the Asia-Pacific region from 2017 to 2020. The index highlights the 
top 10 cities (regions) for each first-level indicator and the top 10 cities (regions) 
for comprehensive ranking. Based on the different parameters analyzed, the 
cities from major economies maintain a leading position in driving forces, with a 
strong presence in the top-ranking levels for most indicators. However, cities from 
emerging and island economies also find their way to the top in terms of carrying 
capacity to tackle environmental stress. “It is still hard to undo the damage and 
waste done from all these social, economic, and political factors. Therefore, some 
emerging and island economies may perform much better than their developed 
counterparts. We can also see that capital is flowing into emerging markets that 
have not yet been exhaustively exploited for their resources and environment. 
This, of course, is both an opportunity and a challenge for these latecomers,” says 
Lan, also a professor and cochair of the CIDEG academic committee and the lead 
specialist for the Sustainable China Industry Development Initiative 2022. 


AP-ISI: A measure of sustainability promising 
prosperity and better opportunities in the 
Asia-Pacific region 

The human race faces unprecedented challenges to address the rapid deterioration of the world’s environmental quality and the depletion of 
its natural resources. At stake are the lives and health of many people as well as the potential for global economic growth. 

Xue Lan and his collaborators launched the Asia-Pacific Industrial Sustainability Index, a yearly 
measurement of sustainability in the region. 


Recommendations for a brighter future 

The AP-ISI and its global report aim to inform the public, businesses, and local 
governments about the progress of industrial sustainability in the Asia-Pacific 
region. Its ultimate goal is to raise public awareness of the need to balance 
industrial development and sustainability and empower residents’ involvement 
in policy processes. The index will also help local governments evaluate their 
successes and challenges in promoting and achieving sustainable development. 
Finally, it can help identify regional and global industrial and sustainable 
development trends and guide the investment behaviors of businesses operating in 
the Asia-Pacific region and beyond. 


Visit: www.science.org/content/resource/asia-pacific-industrial-sustainability-index-cities 
Sponsored by 
A long-exposure photograph shows the Clepsydra Geyser erupting on a moonlit 
night in Yellowstone National Park's Lower Geyser Basin. Geysers and other 
hydrothermal features in Yellowstone National Park are fueled by heat from a 
magma reservoir in Earth's upper crust. New seismic imaging of Yellowstone has 
provided further insight into the volume and distribution of magma in the 
subsurface. See pages 945 and 1001. 
EDITORIAL 


Science urgencies for Brazil 


Last month in Egypt at the United Nations Climate Change Conference, Brazil’s 
president-elect Luiz Inácio Lula da Silva reaffirmed his pledge to make Brazil a 
global leader in addressing climate change and deforestation. However, when Lula 
takes the reins on 1 January, he will step into a situation that is quite different 
from when last he was president (2003-2010). At that time, he prioritized science 
and education in all government actions and guided Brazil to a prosperous social 
state and sustainable economy. This time, he will face a much different local and 
global scenario. The world is still going through an unprecedented health crisis, 
and like other countries, Brazil needs to establish new ways of tackling the 
consequential social, educational, environmental, and economic problems. This will 
be especially challenging given that since 2016, the country has taken the opposite 
direction of most governments, cutting investments in education at all levels and 
in science, technology, and innovation (ST&I). The question is how Lula can 
immediately address the serious poverty and hunger crisis in Brazil while also 
restoring the environment and a competitive and equitable economy to the country. 

In the 21st century, countries are focusing on developing capacities to create 
scientific knowledge and technologies to improve societal well-being. Science and 
education thus emerge as priorities. It is therefore urgent for Lula to recompose 
the budget for the country’s education and ST&I sectors. However, the new 
government will face an unparalleled challenge. Combined with the historic cuts in 
social and ST&I policies, the budget for 2023 presented by the Bolsonaro 
administration does not consider basic expenditures that will necessarily have to 
be resumed. Moreover, Brazil’s congress is the most conservative since the 
country’s return to democracy. But investment in science and education is 
especially important because of Brazil’s young population, who need to be 
adequately educated and be given the kinds of opportunities that innovation can 
create. 

Some indicators of education as well as of ST&I show that on average, across 
countries belonging to the Organisation for Economic Cooperation and Development, 
47% of adults (25 to 34 years old) have attained tertiary education (a bachelor’s 
degree), compared with those in Brazil (23%). The number of adults in Brazil who do 
not progress beyond secondary education (precollege) is one of the highest among 
these nations. Support for aid programs that help students with disadvantaged 
backgrounds to complete a tertiary education has been deeply cut over the past 
4 years. Brazil also has not yet reached full enrollment of children in primary 
school, one of the benchmark goals for the United Nations Sustainable Development 
Goals 2030 Agenda. 

Brazil’s economy ranges from 10th to 12th position (depending on the ranking 
system) in the world, and this outcome is primarily a direct consequence of the 
investments made in ST&I in earlier years. For example, Brazil’s agricultural 
success in boosting soybean production is a result of progress attained mainly 
through research at public universities and institutions. Unfortunately, over the 
past years, the Bolsonaro government has cut research funding, which will soon 
affect Brazil’s global competitiveness. 

Above all, Brazilian science lacks constancy and continuity in ST&I investments. 
The National Fund for Scientific and Technological Development is now the only 
source for financing ST&I. The Bolsonaro government claims to have balanced 
expenditures by restricting use of the fund. The new administration should 
reinvigorate this resource and embrace ST&I as a priority. 

Brazil urgently needs to commit to strategies that achieve leadership in the age 
of knowledge, such as investing in “big data,” artificial intelligence, and 
communication technologies. These areas will continue to revolutionize the global 
job market but stand to increase inequalities in Brazil unless special attention is 
given to proper education. Also, there is a need to cultivate multidisciplinary and 
transdisciplinary research that considers societal and environmental impacts to 
ensure that science broadly supports the country’s economic and social development. 

Twenty years after his first presidency, Lula returns and faces his biggest 
obstacles yet. The good news is that he still champions science and education. The 
people of Brazil must come together and stand behind building a future for today’s 
youth. 

— Helena B. Nader 
hbnader@abc.org.br 
Students at Tsinghua University held up sheets of 
paper on which they wrote the Friedmann equation 
(left) during protests across China against government 
repression. The equation, describing the expansion of 
space, served as a sly reference to “freed man.” 




Nearly all shark species hunted for their fins must be caught sustainably, 
according to new trade rules adopted last week under the Convention on 
International Trade in Endangered Species of Wild Fauna and Flora. In a move that 
supporters called historic, 183 countries and the European Union voted to place 
nearly 100 species of threatened sharks and sharklike rays on the treaty’s 
Appendix II, roughly tripling the number that must be managed to avoid 
overexploitation. Within 1 year, nations exporting shark fins or meat must certify 
the animals were caught legally and sustainably. Shark populations have shrunk for 
7 decades because of a lack of fishing regulations and enforcement (Science, 
11 November, p. 617). The trade in fins, which are used for soup, has been 
particularly devastating, putting 61 species in danger of extinction. Trade in 
shark products was worth nearly $1 billion in 2015, according to the most recent 
broad review by the U.N. Food and Agriculture Organization. Also newly listed are 
animals overexploited for the international pet trade, including more than 
160 species of glass frogs and 50 kinds of turtles and tortoises. 


Blacktip reef sharks are among dozens of species for which international 
trade will be regulated. 


Monkeypox gets a neutral name 


INFECTIOUS DISEASES | The World Health Organization (WHO) announced this week 
it will start referring to monkeypox disease as “mpox” (pronounced “em-pox”) 
after the current name drew criticism as evoking racist stereotypes and inviting 
stigmatization. It is also a misnomer: The virus was first identified in 
laboratory monkeys but is most likely carried by rodents in the wild. During a 
1-year transition period, WHO will use both names. Earlier this year, the agency 
changed the names of the two different clades, or branches, of monkeypox viruses 
that had been based on the regions where they were first identified. The Congo 
Basin clade became clade I and the West African clade, clade II. Weekly monkeypox 
cases have declined globally since August, but hundreds of cases are still 
reported every week and health authorities continue to call for at-risk people to 
be vaccinated. 


Embryo-editing scientist reboots 


GENE THERAPY | He Jiankui, who in 2018 
conducted a widely condemned experiment 
in which his team edited the genes of human 
embryos and later implanted them into their 
mothers, says he has opened a new lab to 
develop “affordable” gene therapies. In 2018, 
Chinese officials detained He, a biophysicist, 
for using the CRISPR gene editor on the 
embryos, created through in vitro fertiliza- 
tion. The experiment led to the birth of 
three babies. A court convicted him of illegal 
medical practices, and He was released from 
prison in April. Last week, He described his 
latest venture on Weibo, a popular social 
media platform in China: His lab in Beijing 
will aim to “overcome 3-5 genetic diseases 
within 2-3 years to benefit families with rare 
diseases.” He cautioned about the recent 
death of a man with Duchenne muscular 
dystrophy (DMD) who was in a trial to test 
a CRISPR-based gene therapy. “History tells 
us that when any new technology emerges, 
it is both an angel and a devil,” He wrote on 
Weibo. “Blind pursuit of new technologies 
and aggressive advancement will definitely 
be punished by heaven.” He told Science he 
has asked Jack Ma, the billionaire head of 
the Alibaba Group, for $140 million to fund 
his new lab’s efforts against DMD. 


NSF rules tighten funding race 


INFRASTRUCTURE | Researchers seeking 
National Science Foundation (NSF) grants 
for research equipment will likely face 
longer odds under new rules that don’t 
require institutions to share the cost. This 
summer, Congress ordered NSF to suspend 
cost sharing for its $75 million Major 
Research Instrumentation (MRI) program 
and foot the entire bill for each successful 
proposal. In a new solicitation (NSF 23-519), 
the agency projects the number of awards 
next year will drop from the current 150 to 
100 to accommodate that change, which 
is intended to diversify the applicant pool 
and give less wealthy institutions a better 
shot at winning a grant. The reduction in 
awards will disappoint some applicants, says 
comparative biologist Cheryl Hayashi of the 
American Museum of Natural History, a past 
recipient of MRI grants. “But I don’t see a 
downside to having a more diverse pool.” 
NSF will allow each institution to submit up 
to four applications, up from three, provided 
the fourth proposal is for an environmen- 
tally sustainable instrument. 


Dam removal to boost salmon OK’d 


CONSERVATION | The world’s largest dam 
removal project will begin as soon as 2023, 
after U.S. regulators last month approved 
tearing down four hydroelectric dams on 
the Klamath River in Northern California 
and Oregon. The 17 November unanimous 
vote by the Federal Energy Regulatory 
Commission was the last regulatory hurdle. 
Native American tribes and environmental- 
ists have for years sought removal of the 
dams, which were built in the early 20th 
century and block migrating salmon from 
reaching some 600 kilometers of habitat. 
Salmon runs on the river have dwindled to 
less than 5% of historic levels. 


Time and other units get tweaks 


STANDARDS | The controversial leap second, 
which timekeepers add sporadically to 
keep atomic clocks aligned with Earth’s rota- 
tion, will be axed in 2035, the International 
Bureau of Weights and Measures (BIPM) 
decided on 18 November. Devised in 1972 
and used 27 times since, the leap second 
wreaks havoc with modern-day telecom- 
munications, banking, and other networks. 
Its abandonment means that astronomical 
time, based on Earth’s rotation, will slowly 
diverge from Coordinated Universal Time, 
based on the vibrations of cesium in 
atomic clocks. BIPM plans to stop adding 
leap seconds for 100 years, by which time 
someone may have figured out a long-term 
fix for the problem. In addition, BIPM added 
new prefixes to the International System 
of Units to define very big and very small 
measurements. For example, 1 ronnameter 
(Rm) is 1 billion billion billion meters and 
1 quettameter (Qm), 1000 times bigger 
still; 1 rontometer (rm) is one-billionth 
of a billionth of a billionth of 1 meter and 
1 quectometer (qm), one-thousandth of that. 
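
As a rough check on the arithmetic in the item above, the four prefixes can be written as powers of ten. The sketch below is illustrative only: the table name NEW_SI_PREFIXES and the helper to_meters are hypothetical, and the exponents (27, 30, -27, -30) follow from the item's wording ("1 billion billion billion," "1000 times bigger still") rather than from any BIPM text quoted here. Exact fractions are used so the very small values do not suffer floating-point rounding.

```python
from fractions import Fraction

# Power-of-ten exponent for each prefix symbol named in the item.
# (Hypothetical table for illustration; symbols are case-sensitive.)
NEW_SI_PREFIXES = {
    "R": 27,   # ronna:  1 billion billion billion = (10**9)**3
    "Q": 30,   # quetta: 1000 times bigger than ronna
    "r": -27,  # ronto:  one-billionth of a billionth of a billionth
    "q": -30,  # quecto: one-thousandth of ronto
}


def to_meters(value, prefix_symbol):
    """Convert a length written with one of the new prefixes to meters, exactly."""
    exponent = NEW_SI_PREFIXES[prefix_symbol]
    return Fraction(value) * Fraction(10) ** exponent


billion = 10 ** 9

# The relationships stated in the item, checked exactly.
assert to_meters(1, "R") == billion ** 3
assert to_meters(1, "Q") == 1000 * to_meters(1, "R")
assert to_meters(1, "r") == Fraction(1, billion ** 3)
assert to_meters(1, "q") == to_meters(1, "r") / 1000
```

Keying the table on the symbol rather than the prefix name keeps the uppercase (large) and lowercase (small) variants distinct, which matters because "R" and "r" differ by 54 orders of magnitude.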


Cannabis research to open up 


BIOMEDICINE | The U.S. Congress has 
approved its first stand-alone bill enabling 
marijuana research and sent it to President 
Joe Biden, who is expected to sign it. The 
measure directs the Drug Enforcement 
Administration (DEA) to set up a stream- 
lined system allowing scientists to register 
to study cannabis for medical purposes. 

The legislation also orders DEA to speedily 
register new growers, including universi- 
ties, to raise and distribute it for research. 
The bill requires the U.S. attorney general 
to conduct yearly assessments of whether 
there is an adequate, uninterrupted supply of 
cannabis for research. The Senate passed the 
bill, the Medical Marijuana and Cannabidiol 
Research Expansion Act, on 16 November, 
following a lopsided vote of approval by the 
House of Representatives in July. Separately 
in October, Biden ordered the U.S. attorney 
general to consider reclassifying the drug, 
which would also make it easier to study. 


EVOLUTION 


In a first for animals, clam makes its own antibiotic 


Natural antibiotics typically come from bacteria or molds. But some clams 
make their own erythromycin, a study has found—the first animals reported to 
possess this ability. The spotted hard clam (Meretrix petechialis) has a 
mucus-covered outer lip that contains specialized antibiotic-producing cells, 
according to an international research team. These may protect the clams, which lack 
adaptive, lymphocyte-based immune systems, from disease. The scientists found 
no sign of erythromycin-producing bacteria in the clam’s tissues; instead they 
noticed its DNA contained an erythromycin-making gene that resembled one used by 
bacteria but differed enough that the invertebrate version might have evolved inde- 
pendently. The researchers found the gene in all the clam’s life stages. Its genome 
also contains other genes needed to produce erythromycin, and a related species of 
clam possesses these antibiotic genes as well. The findings suggest scientists can 
engineer cells in other animals to produce their own antibiotic, the authors write this 
week in the Proceedings of the National Academy of Sciences. 
Indictment of monkey importers 
could disrupt U.S. research 


Alleged Cambodian smuggling ring poses dangers to wild 
macaques and the drug studies they’re used in 


By David Grimm 


The indictment of several members of an 
alleged international monkey smug- 
gling ring is sending ripples through 
the U.S. biomedical community. The 
U.S. Department of Justice (DOJ) 
has charged two Cambodian wildlife 
officials and several members of a Hong 
Kong-based primate supply company 
with illegally exporting hundreds—and 
potentially more than 2000—cynomolgus 
macaques, an endangered species, to 
the United States for research. The ani- 
mals were reportedly captured from the 
wild in Cambodia and falsely labeled as 
captive-bred. 

The indictment, which carries multiple 
felony charges, will likely exacerbate the 
shortage of these monkeys, used in every- 
thing from drug safety testing to vaccine 
research, says Dave O’Connor, a virologist 
at the University of Wisconsin, Madison, 
who uses cynomolgus macaques to study 
infectious disease. Still, he says, the main 
priority should be stopping this illegal 
trade, for the sake of research and the ani- 
mals themselves. “These sorts of unscrupu- 
lous actors give a black eye to an already 
heavily scrutinized industry.” 


A family of cynomolgus macaques 
in the wild in Cambodia. 


Animal advocacy groups are pushing to 
ban further monkey imports until more is 
known about their provenance. And some 
experts in the biomedical community are 
suggesting moving the breeding of this spe- 
cies of macaque to the United States and 
collecting more genetic and pedigree in- 
formation on the monkeys that come from 
overseas. They also say labs should find 
ways to use fewer of these animals. 

It’s unclear how many of the animals in 
question have been used in research, but 
one of two companies that received them is 
the largest private supplier of monkeys to 
U.S. research labs, Science has learned. The 
company, Inotiv, tells Science that “while 
we do not yet know if these allegations will 
be proven true, Inotiv strongly condemns 
any and all unauthorized trading/importa- 
tion of endangered species. ... To confirm 
our screening processes are (and were) 
solid, we have plans in place to conduct the 
necessary audits to ensure excellence and 
provide transparency.” 

Cynomolgus macaques, also known as 
long-tailed macaques or cynos, are—by far— 
the monkey species most imported to the 
United States. Nearly 30,000 entered the 
country last year, according to data from 
the U.S. Centers for Disease Control and 
Prevention (CDC), which regulates the im- 
port of nonhuman primates. Most cynos are 
used by pharmaceutical and biotechnology 
companies. (Rhesus macaques, another spe- 
cies common in U.S. biomedical research, 
are mostly used by the academic commu- 
nity, which largely obtains them from na- 
tional primate research centers.) Cynos are 
also the main monkey species imported to 
Europe for research. They are typically bred 
in large facilities in Asia. 

China used to be the main supplier of 
cynos—exporting approximately 30,000 in 
2018—but the country has shut off its supply, 
which experts have attributed to the trade 
war with the United States and China’s de- 
sire to beef up its own biomedical industry. 
Several countries, mostly in Southeast Asia, 
stepped in to fill the gap. Cambodia now ex- 
ports the largest share of cynos—more than 
29,000 in 2020, the vast majority of which 
were shipped to the United States. 

Though the actual number of cynos in the 
wild is unclear, the International Union for 
Conservation of Nature downgraded the sta- 
tus of the monkeys from vulnerable to endan- 
gered this year, citing growing demand from 
the research industry as a factor that could 
incentivize illegal trade. 

The need for cynos has indeed grown, especially 
during the pandemic. The species was 
one of the main animal models employed to 
test COVID-19 vaccines, and researchers are 
increasingly using the monkeys to study Al- 
zheimer’s, Parkinson’s, and other diseases. 
“There is a sky-high demand for these ani- 
mals,” O’Connor says. 

That may be what’s fueling the alleged 
illegal trade in Cambodia. According to 
the DOJ indictment, two high-ranking 
employees of Vanny Resources Holdings, 
a Hong Kong-based company that breeds 
monkeys for research, paid millions to 
black market suppliers and Cambodian 
wildlife officials to capture thousands of 
cynos from national parks and other protected 
areas of Cambodia, and to fake their 
paperwork to indicate the animals had 
been bred in captivity. 

Nearly 1500 of these “laun- 
dered” cynos arrived in the 
United States from 2018 to 2020, 
according to the indictment, 
with potentially hundreds more 
in 2021. They appear to have 
ended up at facilities in Flor- 
ida and Texas. The companies 
running the facilities are not 
named in the indictment, but in 
a filing with the U.S. Securities 
and Exchange Commission, Inotiv disclosed 
that its principal supplier of nonhuman 
primates was the target of the DOJ probe, 
indicating that the company gets most of its 
monkeys from Vanny. 

Last year, Inotiv purchased the major 
research animal supplier Envigo (under 
fire recently for a series of animal welfare 
violations at one of its beagle breeding fa- 
cilities), making it the world’s largest sup- 
plier of nonhuman primates for research. 
The company currently houses more than 
9000 monkeys—the vast majority cynos— 
which it sells to private and academic labs. 

Cindy Buckmaster, spokesperson for 
Americans for Medical Progress, which 
advocates for the use of animals in sci- 
entific studies, says there’s not much that 
companies like Inotiv—or the labs they sell 
to—can do to check the provenance of the 
animals they receive. “We have to take the 
documentation at face value,” she says. 

Still, she calls the alleged illegal import 
of cynos “horrible” for the animals, and 
a violation of the trust both the scientific 
community and the public put in animal 
research. She says wild-caught monkeys 
carry viruses that could infect other animals 
they’re housed with, or humans. And they’re 
prone to stress just from being around peo- 
ple for the first time, which could result in 
“very different data.” 

The animal rights group People for the 
Ethical Treatment of Animals has asked 
CDC to suspend all nonhuman primate exports 
from Cambodia. It also asked the U.S. 
National Institutes of Health “to determine 
the precise origin of every [cyno] imported 
from Cambodia since 2017 and currently in 
publicly funded laboratories.” 

In a statement, Cambodia’s ministry of 
agriculture said it was “surprised and sad- 
dened” by the indictment, and is committed 
to upholding all laws governing the interna- 
tional trade of animals. It also denied that 
any of its exported monkeys had been cap- 
tured from the wild. 

“If this is the reaction from the supply 
side, then we think there needs to be much 
stricter controls on the demand side,” says 
Eric Kleiman, a researcher at the Animal 
Welfare Institute, an animal advocacy 
group that has closely followed 
the issue. “If monkeys are to 
be used in research in the U.S., 
there is a responsibility to en- 
sure they are well cared-for ... 
and sourced responsibly.” 

Sarah Kite, co-founder of Ac- 
tion for Primates, a U.K.-based 
advocacy organization, agrees. 
She notes the European Union 
is poised to pass a law that will 
ban the import of all wild-caught 
animals and their offspring. “That’s the only 
way to ensure they’re not getting wild-caught 
animals,” she says. “The research community 
needs to be held accountable for what’s hap- 
pening to this species.” 

O’Connor suggests one solution: ensuring 
that all captive-bred animals have extensive 
pedigree records, which could be kept in 
a global registry. He also suggests geno- 
typing every monkey used in research 
studies to better trace their origins. 

Major pharmaceutical and biotech com- 
panies contacted by Science either declined 
to comment on the issue or did not respond. 

“I think [the indictment] is going to constrict 
the pipeline even more,” says a U.S. 
consultant on industry and academic mon- 
key research who has worked in the field 
for decades but asked not to be named 
because of concerns of damaging relation- 
ships with his clients. The U.S. biomedical 
community should invest in breeding these 
animals domestically, he says. “We need to 
migrate away from shipping an animal 
from halfway around the world, where we 
can’t control where it came from.” 

But the community also needs to work 
to reduce the number of monkeys it uses, 
he says. That could be accomplished by 
designing studies to require fewer ani- 
mals, and working with regulators to re- 
quire fewer animals for research such as 
drug safety studies. “That’s easier, faster, 
and less expensive than building up a big- 
ger pipeline.” 
CRISPR is so popular even viruses may use it 


Thousands of phages 
appear to have stolen the 
gene-cutting mechanism 


By Mitch Leslie 


The gene-editing tool CRISPR started 
out as a bacterial defense against in- 
vading viruses. But it turns out the 
intended targets have stolen CRISPR 
for their own arsenals. A new study 
reveals that thousands of the bacteria- 
attacking viruses known as bacteriophages 
(phages, for short) contain the CRISPR sys- 
tem’s genetic sequences, suggesting they may 
use them against rival phages. The finding 
could boost CRISPR’s laboratory usefulness. 

The discovery “opens doors for possible 
new applications of CRISPR systems,” says 
genomicist Mazhar Adli of Northwestern 
University’s Feinberg School of Medicine, 
who wasn’t connected to the research. 

Like other viruses, phages cannot repro- 
duce on their own. Instead, they hijack bacte- 
ria’s molecular machinery, often killing their 
hosts in the process. The CRISPR system en- 
ables bacteria to fight back. It includes repet- 
itive stretches of DNA that match sequences 
of previously encountered phages. If these 
same phages attack a bacterium again, it 
uses this repetitive DNA to encode strands of 
RNA that can steer a partner enzyme, which 
acts like a pair of genetic scissors, to cut the 
phage’s genome at specific places. 

For about the past decade, scientists have 
been working to turn this microbial immune 
defense into a gene-editing technique for 
myriad uses, including improving crop de- 
fenses, detecting pathogens, and fighting dis- 
eases such as cancer. 

Characteristic DNA that encodes compo- 
nents of the CRISPR system had previously 
turned up in a handful of phages. At first, 
scientists regarded these finds as mere “cu- 
riosities,” says structural biologist Jennifer 
Doudna of the University of California (UC), 
Berkeley, who shared the 2020 Nobel Prize 
in Chemistry for showing how to tailor 
the CRISPR system to target particular sequences. 
“But they got us wondering if these 
systems were more common.” 

To find out, Doudna, UC Berkeley geo- 
microbiologist Jillian Banfield, and their col- 
leagues went looking for additional examples 
of CRISPR in the phage world. They probed 
DNA from a variety of environments that are 
rich in bacterial hosts for the viruses, includ- 
ing soil and the human mouth. This trawl 
uncovered more than 6000 types of phages 
that contain CRISPR system DNA, the sci- 
entists reported last week in Cell. They also 
examined phage genome sequences that had 
been posted to online databases and found 
even more instances of the CRISPR-carrying 
viruses. Although fewer than 1% of phages 
sport the sequences, the researchers did not 
expect “such a broad distribution of an anti- 
phage system in phages,” Doudna says. 

Why would phages acquire a system that 
evolved to thwart them? The most likely rea- 
son, Doudna says, is to beat the competition. 
Multiple viruses can attack a bacterium at the 
same time, leading to “phage wars” inside an 
infected cell, she says. Bacteria are also vul- 
nerable to rogue DNA strands known as plas- 
mids that coerce the cells into copying them. 
By destroying these rivals with the CRISPR 
system, phages “can have the replication machinery
all to themselves,” Doudna says.

The phages presumably swiped these 
CRISPR system sequences from their mi- 
crobial victims, she says. Since then, the vi- 
ruses have customized the systems for their 
own ends. For instance, some phages seem 
to have lost the capacity to generate certain 
molecules that can kill bacteria, possibly to 
preserve their hosts to produce more phages. 

The phages’ CRISPR tricks may inspire 
new biotechnology. For instance, most 
CRISPR-based approaches now rely on the 
enzyme Cas9 to cut DNA. However, Cas9 is 
so large it cannot fit into some viruses used 
to genetically modify cells. A number of 
phages, however, boast a slimmed-down ver- 
sion known as Cas-lambda that is about 50% 
smaller, Doudna and Banfield’s team found. 
Adli says this smaller enzyme could allow 
new gene-editing uses for CRISPR, such as 
altering plant genomes, though researchers 
would first need to overcome several bio- 
engineering hurdles. 

Microbiologist Joseph Bondy-Denomy 
of UC San Francisco says Doudna and 
Banfield displayed a “[John] Lennon-[Paul] 
McCartney” level of synergy in ferreting out 
so many CRISPR-bearing phages that had 
eluded other scientists. Still, he wants to 
see evidence the viruses actually put their 
CRISPR systems to use when they invade 
bacteria. Bondy-Denomy also suspects
many more phages that wield CRISPR are 
waiting to be discovered. “The next step is 
more,” he says.

Pandemic accelerates ‘GRExit’


Nearly all U.S. Ph.D. programs have dropped the 
standardized GRE exam as an admissions requirement 


By Katie Langin 


The University of Michigan’s biomedical
Ph.D. program was a lonely
outlier in 2017 when it announced it
would stop asking applicants to submit
scores for the Graduate Record
Examination (GRE) General Test.
At the time, the standardized exam was a 
nearly universal requirement for Ph.D. pro- 
grams at U.S. universities. But the Michigan 
program now has plenty of company. The 
vast majority of STEM Ph.D. programs have 
stopped requiring GRE scores, according to 
an investigation by Science, and the number 
of tests taken each year has plummeted. 

The COVID-19 pandemic, unease about 
whether the test puts students from less 
privileged backgrounds at a disadvantage, 
and doubts about how well GRE scores 
predict grad school success all helped drive 
the changes. But whether they will be per- 
manent and how they are affecting the ap- 
plicant pool or incoming class composition 
remain to be seen. 

To quantify the trend, Science examined 
the application requirements for Ph.D. pro- 
grams in eight disciplines at 50 top-ranked 
U.S. universities. Only 3% currently require
prospective students to submit GRE Gen- 
eral Test scores, compared with 84% 4 years 
ago. An additional 5% strongly recommend 
that prospective students submit scores. 
Others make it optional; one program’s 
website reads, “In certain cases, a strong 
GRE score submitted with your application 
can improve your chances.” But 36% of the 
programs explicitly state that GRE scores 
will not be accepted or reviewed as part of 
the admissions process. 

Early on, the so-called “GRExit” move- 
ment was mostly restricted to the life sci- 
ences. But the shift away from the GRE 
now touches every discipline. “It really did 
spiral” quickly, says Sarah Ledford, an as- 
sistant professor in the geosciences at Geor- 
gia State University, who maintains a list of 
earth sciences programs that don’t require 
GRE scores. In 2018, geology was the only 
discipline in which every single department 
Science examined required the GRE; now 
none does. 

Ledford attributes much of the shift to 
“a reckoning” around diversity. She and 
other scientists argue that the cost of the
test—$220 per attempt, plus travel expenses 
and training materials—disadvantages 
students from lower socioeconomic back- 
grounds and dissuades them from applying 
to graduate school. COVID-19 provided an- 
other reason to drop the test requirement, 
as a move to online testing led to concerns 
about whether some students had access to 
a suitable testing environment. “This was 
low-hanging fruit during COVID for places 
to give it a shot,” Ledford says.

Those logistical concerns came on top of 
research indicating the GRE doesn’t predict 
whether a student will succeed in graduate 
school. “The data—they have to be relevant. 
... Otherwise it’s just noise,” says Jennifer
Gomez, an assistant professor of social work
at Boston University who has studied the
use of GRE scores in psychology admissions.
“The GRE ... predicts nothing of substance
beyond grades.” On top of that, the test “unfairly
privileges certain groups—white men
in particular,” she says.

Fewer tests taken
The number of Graduate Record Examination (GRE)
tests administered in the United States has dropped
in recent years.

But not everyone is sold on the transi- 
tion. “I think it’s a mistake to remove GRE 
altogether,” says Sang Eun Woo, a professor
of psychology at Purdue University. Woo is 
quick to acknowledge the GRE isn’t per- 
fect and doesn’t think test scores should 
be used to rank and disqualify prospective 
students—an approach many programs have 
used in the past. But she and some others 
think the GRE can be a useful element for 
holistic reviews, considered alongside qualitative
elements such as recommendation letters,
personal statements, and CVs. “We’re
not saying that the test is the only thing that
graduate programs should care about,” she
says. “This is more about, why not keep the 
information in there because more informa- 
tion is better than less information, right?” 

Removing test scores from consideration 
could also hurt students, argues Alberto 
Acereda, associate vice president of global 
higher education at the Educational Test- 
ing Service, the company that runs the GRE. 
“Many students from underprivileged back- 
grounds so often don’t have the advantage 
of attending prestigious programs or taking 
on unpaid internships, so using their GRE 
scores serves [as a] way to supplement their 
application, making them more competitive 
compared to their peers.” 

There’s no surefire method for “evaluat- 
ing people’s potential for doing original re- 
search,” says Danny Caballero, an associate 
professor of physics education at Michigan 
State University who has studied graduate 
admissions in physics. “And that’s just because
that is a really complicated thing to
do.” Like Woo, he’s advocated for the use
of rubrics that ask reviewers to evaluate
prospective students holistically, based on
the strength of their academic preparation,
research track record, initiative and perseverance,
and fit with the program. Before
the pandemic, his program included GRE
scores in its rubric—giving them a weight of
10%—but when the program shifted to making
the GRE optional, it eliminated those
scores from the rubric. He has no regrets.
“The work of science is not taking standardized
tests. The work of science is being curious,”
he says.

At Cornell University, the geological sciences
graduate program also made the
switch to a rubric-based holistic review
shortly before announcing it would no
longer accept GRE scores in 2020. Matt
Pritchard, the program’s graduate director,
says the policy switch hasn’t been contentious
and the program has no plans to
reinstate the GRE. “It is a new system to
learn, but once you get into the groove of it
... it probably takes no more extra time,” he
says. Pritchard is hopeful the shift will result
in a more diverse applicant pool. He says it’s
too early to tell for sure because there’s so
much year-to-year variability in applicant
demographics. But since the change, “We
have more applicants and more diverse applicants
and I would say the quality is as
strong as ever.”

In the long run, many GRExit proponents
see the new admission requirements as a
policy shift that will be hard to undo. “What
I hear from students is this is absolutely
something that they look for in considering
where to apply,” Ledford says. “They don’t
want to take [the test].”

Joshua Hall, who was an early advocate
for dropping the GRE when he worked in
graduate admissions for the biological and
biomedical science program at the University
of North Carolina, Chapel Hill, agrees that in
the life sciences a shift back toward the GRE
appears unlikely. “Admissions committees
seem to have acclimated to the no-GRE format,”
says Hall, now a senior program officer
at the Howard Hughes Medical Institute. “I
have heard no rumblings of going back.”

But in other disciplines, that may not
be the case. Dartmouth College’s computer
science program has returned to requiring
the GRE after waiving the requirement
in 2020 and 2021. Carnegie Mellon University’s
psychology program made a similar
change earlier this year. It later backtracked,
however, and lifted the requirement again.
When asked about the change in policy, the
program’s graduate director, Vicki Helgeson,
wrote in an email to Science, “The whole
thing is in a state of flux.”

Giving it a pass
Since 2018, most STEM Ph.D. programs at 50 top U.S. universities have moved away from requiring the
Graduate Record Examination (GRE).


Europe pledges to launch Mars rover delayed by war


Replacing Russian launcher and lander means probe
won’t reach Mars until 2030


By Daniel Clery 


After repeated delays and the loss of
its Russian collaborators, Europe’s 
ExoMars rover is go for launch 
again, in 2028, government minis- 
ters agreed last week. The rover was 
due to set off for the Red Planet in 
September on a Russian Proton rocket and 
land with a Russian-built craft, until the 
European Space Agency (ESA) cut ties with 
Russia after its invasion of Ukraine. At a 
budget-setting meeting last week, ESA resolved
to launch the mission on a yet-to-be-determined
U.S. rocket and develop its own
lander—with some help from NASA. 

“This is fantastic news for science and 
for the search for signs of life elsewhere,” 
says Andrew Coates of University College 
London, principal investigator (PI) of a 
panoramic camera on the rover. “It’s some- 
thing positive: We still have a mission,” 
adds Valérie Ciarletti of the University 
of Paris-Saclay, PI of the rover’s ground- 
penetrating radar. The golf cart-size rover, 
dubbed Rosalind Franklin after the British 
DNA pioneer, carries a sample-collecting 
drill that can penetrate up to 2 meters 
underground, where signs of ancient life 
might be better preserved. 

The announcement came at the end of a 
budget meeting of ESA’s 22 member states 
that occurs every 3 years. Ministers approved 
€16.9 billion in funding over the next 5 years 
to cover science, exploration, rockets, Earth 
observation, and telecommunications.
The nearly 17% increase over the previous 
budget was less than ESA management had 
asked for, and some programs will feel a 
squeeze. But the €2.7 billion going to explo- 
ration is enough to revive ExoMars, which 
includes multiple Mars missions. 

The project’s first phase delivered the 
Trace Gas Orbiter to Mars in 2016, as well 
as a landing demonstrator, called Schiaparelli,
which failed less than a minute before
touchdown because of a software error 
that switched off landing rockets too early. 
Rosalind Franklin was set to follow in 2018. 

Problems mating the rover with the 
Russian-made lander, called Kazachok, 
delayed the launch by 2 years. (Mars win- 
dows occur roughly every 2 years, when 
the planets align.) Trouble with the lander’s 
parachutes, solar panels, and wiring forced 
another delay to this year. Then in March, 
with Kazachok and Rosalind Franklin ready 
to go, war intervened. 

In building a new, Russia-free landing 
system, “we are not starting from scratch,” 
says Thierry Blancquaert, ESA’s ExoMars 
team leader. Most components on Schiapa- 
relli worked faultlessly, and ESA provided 
some systems on Kazachok, including its 
parachutes, radar, radio, and onboard 
computer—items engineers will extract 
from the lander and reuse. But no European 
manufacturer makes the kind of thrusters 
needed to set the 310-kilogram Rosalind 
Franklin gently on the surface. This is 
where NASA has come in, Blancquaert says. 
It has offered to source the thrusters from a 
U.S. manufacturer. 

NASA may also provide radioisotope 
heaters, power packs that use the decay of 
plutonium-238 to keep the rover from freez- 
ing during the frigid martian night. If it 
does, U.S. regulations require that the probe 
fly on a U.S. launcher, which would most 
likely be a SpaceX Falcon-Heavy or a Vulcan 
from United Launch Alliance. NASA would 
not confirm any details of its involvement, 
but Eric Ianson, the agency’s Mars Explora- 
tion program director, said in a statement: 
“NASA and ESA are planning key conversa- 
tions in the coming months on a potential 
collaboration for ESA’s ExoMars Rosalind 
Franklin rover mission, subject to the avail- 
ability of U.S. funding.” 

NASA’s Perseverance and China’s
Tianwen-1 rovers have almost a decade’s 
head start on Mars. But an ESA study found 
that ExoMars will still provide good science 
after it lands in 2030—especially investiga- 
tions using its deep drill. For the scientists 
involved there is no doubt: “No mission can 
replace ExoMars,” Ciarletti says. 

The biggest losers in the new plan are 
scientists—both Russian and European— 
who designed instruments to be mounted 
on Kazachok. Because of the tight timetable 
for developing the new ESA lander, it will 
not do any science. “My feeling is of great 
discouragement due to the current geo- 
political situation,” says Francesca Esposito 
of the Astronomical Observatory of Capo- 
dimonte, who built a dust sensor for Kaza- 
chok. “But I am very happy that this has not 
stopped this great mission.”

Vaccines are in short supply
amid global cholera surge 


Climate change and pandemic may be fueling outbreaks 


By Kai Kupferschmidt 


On 2 October, Haiti announced cholera
had returned to the country for the
first time since a decadelong epidemic
ended in 2019. The disease killed close
to 10,000 Haitians in those years; now,
with violent gangs fighting for control
over the country and the health system in dis- 
array, things could again get very bad. 

A few days later, Lebanon reported its first 
cholera cases since 1993, in a Syrian refugee 
and a health care worker in the north of the 
country. More cases quickly followed, and 
health organizations fear the Lebanese health 
system, hard-hit by a yearslong financial cri- 
sis, could buckle under the new burden. Two 
weeks later, Kenya, where millions of people 
have fled the worst drought in decades, re- 
ported its first cholera cases as well. 

The outbreaks are part of what the World 
Health Organization (WHO) calls an “un- 
precedented” surge in cholera cases, driven in 
part by climate change and fallout from the 
COVID-19 pandemic. Thirty countries have 
reported outbreaks this year, up from fewer 
than 20 on average the past 5 years. “There 
is a trend of more countries affected, in more 
regions, with a longer duration of outbreaks,” 
says Daniela Garone, the international medi- 
cal coordinator at Doctors Without Borders. 
A global cholera vaccine stockpile is falling 
short, forcing health organizations to ration 
doses—and rethink their control strategy. 

Cholera, spread through water or food
contaminated with the bacterium Vibrio 
cholerae, can cause severe diarrhea and 
kills an estimated 20,000 to 140,000 people 
each year. A lack of clean drinking water, 
poverty, natural disasters, and armed con- 
flicts—such as the gang violence in Haiti— 
have traditionally fueled outbreaks. 

Just a few years ago, the prospects for 
reducing the burden seemed to brighten. 
A new, cheap vaccine, made from inacti- 
vated bacteria lacking part of their toxin, 
was approved in 2015; millions of doses 
were added to an international stockpile for 
emergency use. In 2017, WHO launched an 
ambitious new control strategy that relied 
on vaccination, improving sanitation, and 
widening access to clean drinking water 
and treatment. It was meant to cut cholera 
deaths by 90% by 2030. 

Extreme weather fueled by global warm- 
ing is part of the reason cases are instead 
trending up, says Philippe Barboza, who 
heads WHO’s Cholera and Epidemic Diar- 
rheal Diseases section. Droughts in West 
Africa and the Horn of Africa, massive 
flooding in Southeast Asia, and cyclones in 
southern Africa have displaced people and 
destroyed water and sanitation infrastruc- 
ture. COVID-19’s toll on the health care sys- 
tem made matters worse. The cholera case 
fatality rate in Africa was almost 3% in 2021, 
Barboza says, about three times higher than 
over the previous 5 years. “Every time we 
have investigated why the [death rate] was 
so high, the reason was the same: delayed
access to health care,” he says.

A man receives an oral cholera vaccine at a
barbershop in Akkar, Lebanon, on 12 November.

The 36 million vaccine doses expected to 
be shipped from the stockpile this year won’t 
be enough. Full protection requires two doses 
given 2 weeks apart, so the supply covers just 
18 million people—“not a lot when you think 
of [affected] countries like Bangladesh, Pakistan,
Ethiopia, Nigeria,” Barboza says. Last
month, the coordinating group that runs the 
stockpile announced it would stop admin- 
istering second doses to stretch supplies. A 
one-dose strategy has been used successfully 
before, says Charlie Weller, an immunologist 
at the Wellcome Trust, but it’s unclear how 
long protection will last. (Even the full two- 
dose regimen only protects for 3 years.) 

David Sack, an infectious disease expert 
at Johns Hopkins University’s Bloomberg 
School of Public Health, says he is “puzzled” 
by the decision to abandon the second dose 
instead of postponing it. A clinical trial in 
Cameroon that Sack and colleagues pub- 
lished this month suggests giving the sec- 
ond dose after 1 year instead of 2 weeks 
actually increases the immune response. 
But unless the outbreaks slow down in the 
coming months, the vaccine supply won’t 
allow even a delayed second dose, and be- 
sides, current WHO guidelines don’t allow 
an interval of more than 6 months. 

Meanwhile, Shantha Biotechnics in India, 
which manufactures 10% of the global chol- 
era vaccine supply, plans to stop production 
by the end of 2023. WHO Director-General 
Tedros Adhanom Ghebreyesus has urged 
Shantha and its parent company, Sanofi, to 
reconsider their decision, which would leave
only one manufacturer, South Korea’s Eu- 
Biologics. The International Vaccine Insti- 
tute (IVI), a nonprofit based in South Korea 
that helped develop the cheap oral vaccine, 
is working with EuBiologics to increase its 
production capacity to some 80 million to 
90 million doses annually, says Julia Lynch, 
who directs IVI’s cholera program. It’s also 
helping a South African company named Bio- 
vac set up a facility to produce the shots, in 
a project funded by the Wellcome Trust and 
the Bill & Melinda Gates Foundation. But 
both efforts will take several years. 

Barboza emphasizes that vaccines are only 
one way to address the crisis. Cholera is easy 
to treat with oral rehydration solution, as 
long as it’s administered quickly. That makes 
access to basic health care crucial. “You don’t 
need a respirator, an intensive care unit, and 
God knows what,” Barboza says. Meanwhile,
countries should keep working to improve
access to clean water and sanitation,
he says: “We might have lost a fight, but we 
have not lost the war.”

New look at jaw fossil rewrites
centuries-old bird history 


Specimen flips evolutionary order of two key groups 


By Gretchen Vogel 


A tiny broken bone, misidentified for
decades, has upended scientists’
view of bird evolution. For nearly
200 years, zoologists have divided
birds into two categories: those with
mobile joints in their upper jaw that
allow their upper beak to move, and a much 
smaller group, including ostriches and 
emus, with a fused upper palate that gives 
them a less agile upper beak. This fused pal- 
ate is also found in dinosaurs, including the 
feathered ones that were ancestors to to- 
day’s birds, so zoologists thought ostriches 
and their kin were the evolutionarily older
group of birds, with mobile upper beaks ar- 
riving later in the history of birds. 

Now, paleontologists have identified a 
key skull bone in an ancient bird that lived 
nearly 67 million years ago— 
just before the devastating 
asteroid impact that killed 
off the dinosaurs. The bone, a 
piece of the upper jaw, closely 
resembles its mobile counter- 
part in today’s chickens or 
ducks, leading the research- 
ers to conclude the ancient 
bird also had a jointed up- 
per beak. They suspect the 
jointed beak was present in 
even older birds, because the 
rest of the specimen indicates 
it was a relative of Ichthyor- 
nis, another ancient bird that lived about 
20 million years earlier. Overall, the new 
analysis suggests a jointed beak was already 
present in the ancestor of modern birds, and 
a fused palate re-evolved later in ostriches 
and their kin. 

“It’s changing how we’ve been looking at
the evolution of birds since the time of Linnaeus,”
says Christopher Torres, a paleontologist
at Ohio University, Athens, who was
not part of the new work. “We thought we
had this worked out centuries ago, and now
we’re finally finding fossils that are showing
that we didn’t. We got it mixed up.” 

The fossil, discovered more than 2 de- 
cades ago in a Belgian quarry near the 
Dutch border, was partially described for 
the first time in 2002, but many of its pieces 
remained inside a block of sediment. Juan
Benito and Daniel Field, paleontologists
at the University of Cambridge who study
bird evolution, borrowed the fossil in 2018
from the Natural History Museum of Maastricht
so they could use computerized tomography
to image these remaining bones.

An artist’s depiction of Janavis finalidens.

They hoped to find more of the animal’s 
skull, but initial scans only turned up ver- 
tebrae and ribs. Disappointed, they put the 
project aside for more than a year. When 
Benito returned to the fossil, he was puzzled 
by a bone the earlier analysis had identified 
as part of a shoulder but that seemed too 
small. He realized the piece was a fragment 
of a bone that had been broken in two. 

After identifying the companion piece 
and putting the two together, Benito, Field, 
and colleagues concluded the full structure 
was a particularly delicate part of the up- 
per palate, a bone called the pterygoid that 
is key to the jointed upper 
beak. The researchers, who 
describe the more complete 
fossil this week in Nature, 
argue the bird is a previ- 
ously unknown species and 
name it Janavis finalidens, 
for Janus, the Roman god 
of beginnings, endings, and 
transitions. It was a coastal 
flyer, plying the shallow seas 
that at the time covered 
what is now Belgium and the 
Netherlands, and weighed 
an estimated 1.5 kilograms— 
about the size of a gray heron. 

University of Texas, Austin, paleonto- 
logist Julia Clarke, who studies bird evolu- 
tion, calls the Janavis fossil “an important 
snapshot that shows the skull in a new way 
and adds to the evidence for what collec- 
tion of traits was present in the ancestors of 
modern birds.” 

Several skulls of its older Ichthyornis 
relative have been described in recent years 
with bones that suggested the bird’s upper 
palate might have been jointed, but the evi- 
dence was still fuzzy. Now, in the Janavis 
fossil, “the specific skull bone that materi- 
alized was the particular one we needed” 
to show the upper beak was flexible, Field 
says. Torres agrees. “It’s like a puzzle where 
that one piece was missing, and now we 
have it,” he says.

THE ANCESTORS


DNA from a medieval German cemetery opens a window 
on the history of today’s largest Jewish population 


By Andrew Curry 


On a Sabbath Saturday in March
1349, the Jewish community of 
Erfurt was wiped out in a po- 
grom. The archbishop of Mainz, 
who had granted Jews the right 
to live and work in the medieval 
German city, tried the pogrom’s 
ringleaders, local merchants and 
city council members who owed 
money to Jewish moneylenders. One was
executed and the rest exiled. The city’s 
Christian population, meanwhile, was 
forced to pay restitution. 

Five years later, a new Jewish community 
took root in the narrow, winding streets. Be- 
ginning in 1354, the city funded new houses 
and a synagogue, drawing Jews from across 
Europe to Erfurt. “That must have convinced 
them it would never happen again,” says 
Karin Sczech, an archaeologist who works for 
the city. 

For 100 years, Erfurt’s Jews flourished. 
They bathed in a ritual bath, or mikvah, 
on the banks of the Gera River and bur- 
ied their dead in a large cemetery just 
outside the city walls. Then it all came 
to an end, again. In 1454, the town coun- 
cil revoked the rights of Erfurt’s Jewish 
population, forcing them to leave town. 
The city built a granary on top of their 
cemetery, destroying hundreds of graves 
and repurposing Jewish tombstones to 
build its stout stone walls. 

On a sunny fall day, Sczech points out 
a 192-square-meter plot she and a team 
of other archaeologists excavated on the 
former cemetery’s grounds a decade ago. 
With a municipal construction project 
about to start on the site, their goal was 
to save, study, and rebury any human re- 
mains uncovered by the building work. The 
local Jewish community was closely involved, 
and “their wish was to do as little excavation 
as possible,” Sczech says.

A meter or two under the ground, in the 
shadow of the 500-year-old granary, the team 
found the remains of more than 60 people, 
almost all of them oriented with their legs 
pointed east—toward Jerusalem. Their skel- 
etons were well-preserved, along with traces 
of wooden coffins and nails. 

Before the remains were reburied last 
year at a nearby cemetery, they yielded a 
gift: DNA that is shedding light on the ori- 
gins of the Ashkenazim, the major Jewish 
population that emerged in Germany in the 
Middle Ages and later expanded into central 
and Eastern Europe. Together with a smaller-scale
study published in September that looked at
DNA from six individuals from the Middle Ages
unearthed in Norwich,
England, the Erfurt analysis offers clues to
where the Ashkenazim came from centuries
earlier, and what happened along the way.
The studies also confirm other evidence that
today’s Ashkenazi Jewish population, which
numbers more than 10 million people spread
around the world, has roots in a band of no
more than a few hundred who survived a
population bottleneck in Europe more than
1000 years ago.

Tombstones from a medieval Jewish cemetery in Erfurt,
Germany, are relics of a thriving community.

The Erfurt study, which appears this week 
in Cell, is the first major study of a medieval 
Jewish population from a genetic perspec- 
tive. “We just got a really nice angle from ge- 
netic resources that we didn’t have before,” 
says Elisheva Baumgarten, a historian at the 
Hebrew University of Jerusalem who was not 
involved with the study. “The cumulative evi- 
dence can tell us more than history alone— 
that to me is the really exciting part.” 


An Orthodox rabbi approved plans to sample loose teeth, 
but not bones, from Jewish graves. 


Perhaps equally important, the archaeo- 
logical effort was made possible by a rabbin- 
ical ruling that may establish a precedent 
for future studies of ancient Jewish remains 
that yield precious insights without violat- 
ing religious sensibilities. 


ARCHAEOLOGISTS HAVE UNCOVERED evidence 
of Jewish communities in Germanic prov- 
inces of the Roman Empire as early as the 
300s C.E., particularly in what is today the 
city of Cologne. During the medieval period, 
a trio of German cities—Worms, Mainz, and 
Speyer—was known as the cradle of Ashke- 
nazi culture, with records of Jewish life go- 
ing back to about 900 C.E. 

But the period in between largely remains 
a mystery. Were the Jews of Erfurt and other 
medieval cities tenacious holdovers from 
the Roman era, as some have proposed? 
Or were they the descendants of more re- 
cent pioneers who crossed the Alps around 
800 C.E. to found tight-knit communities 
along the Rhine, near modern-day Frankfurt?
“Ashkenazi Jews emerge in the Rhineland
as migrants,” says Leonard Rutgers, a
historian at Utrecht University and a co- 
author on the Cell paper. “But if they came 
from elsewhere, where did they come from?” 

To find out, geneticists have tried to 
work backward from modern DNA. Today’s 
Ashkenazi populations have high rates 
of certain genetic diseases because many 
individuals carry identical mutations, in- 
creasing the risks to their offspring. Those 
mutations, along with other shared DNA 
sequences, are clues to an early population 
bottleneck that drastically reduced Jewish 
genetic diversity. “Whether they’re from Is- 
rael or New York, the Ashkenazi population 
today is homogenous genetically,” says Hebrew
University geneticist Shai Carmi.

Some ultra-Orthodox Ashkenazi com- 
munities today regularly administer 
genetic compatibility tests during match- 
making to limit the risk that children 
will inherit genetic diseases, and pre- 
conception testing is common in other 
Ashkenazim. But even though modern 
Ashkenazi genomes have been closely 
scrutinized, they can’t give a clear pic- 
ture of events 1000 or more years ago. 
“It helps to have data from the past,” 
Carmi says. 

In 2017, Carmi met with Harvard 
University geneticist David Reich at 
his lab. Reich—an expert on ancient 
DNA whose research has been criticized 
in the past for not engaging meaning- 
fully enough with the concerns of local 
communities—encouraged him to pur- 
sue ancient DNA sampling, if he could 
find an ethical way to do it. Speak- 
ing from his Boston-area home on the 
morning of Rosh Hashana, the Jewish New 
Year, Reich said that although he is of Jew- 
ish descent, before Carmi reached out he 
had avoided studying Ashkenazi genetics. 
“It’s hard to study one’s own community,” he
says. “Working on oneself is complicated— 
you have biases towards what the results 
would be that are cognitively difficult.” 

Carmi left that initial meeting feeling pes- 
simistic about studying DNA from ancient 
Jews. “I thought this would be impossible,” 
he says, “because there would be no permis- 
sion to sample.” Searching for ancient DNA 
for analysis would mean grinding up tiny 
bits of bone for sequencing. Destructive sam- 
pling would also be needed for radiocarbon 
dating. “It’s a hard rule in Judaism 
that you don’t disturb the dead,” says 
Alexander Nachama, chief rabbi of the Jew- 
ish Community of Thuringia and head of the 
modern-day Jewish community in Erfurt. 

Carmi pressed ahead, contacting histo- 
rians and archaeologists in Europe to see 
whether suitable samples existed. “The historians
thought I was crazy,” Carmi says. But
a few got back to him—including Sczech,
who had received rabbinical permission to 
measure bones from the Erfurt cemetery to 
determine their sex and ages at death, tech- 
niques that don’t harm the skeletal material. 

Carmi also reached out to an Orthodox 
rabbi in Israel. After studying centuries-old 
interpretations of Jewish law and listen- 
ing to Carmi’s explanation of the science, 
the rabbi suggested a workaround: teeth. 
Because they fall out naturally (“They’re 
deciduous, like the leaves of a tree,” Reich 
says), the rabbi concluded that stray teeth 
are not part of the body in the same way 
as a skull or rib. Researchers couldn’t go 
looking for them, but if loose teeth turned 
up—as part of a rescue excavation, for 
example—they could be sampled without 
violating Jewish beliefs. 


With a rabbinical ruling in hand, Carmi
approached Jewish authorities in Germany,
including the chief rabbi of Erfurt at the
time. They were hesitant. “They didn’t want
to be the first to allow an ancient DNA study
or go against an established ruling,” Carmi
says. The Israeli rabbi’s reasoning ultimately
won over Erfurt’s Jewish community, and
Carmi was able to sample the loose teeth of
38 individuals from the cemetery before the
bodies were reburied in a 2021 ceremony.

For colleagues working on Jewish sites
elsewhere in Europe, the ruling represents
a religiously acceptable way to apply scientific
techniques to date Jewish remains and
study their ancestry and diet. Carmi says
it might even be applied to DNA analyses
of mass graves from World War II. “The
use of teeth seems to me to be an excellent
compromise,” says Pierre Blanchard,
an archaeologist working for France’s National
Institute of Preventive Archaeological
Research who was not involved with
the study. “I hope that the results of this
first study will show what is at stake and
what benefits each side—scientists and the
Orthodox—can expect.”

Medieval heartland
The Ashkenazi Jewish population in medieval Europe was concentrated in
central Germany; in Erfurt, a community alternately thrived and suffered
persecution. During a period of calm in the late 14th century, Jewish homes
and religious centers occupied the heart of the city, and a cemetery just outside
its walls received remains that are now yielding clues to Ashkenazi origins.

NO SIMILAR DISCUSSIONS of religious ethics
preceded the Norwich study, because it
was only after an initial analysis in 2011 that
the researchers realized the 17 people whose
remains were found during construction
work might have been Jews killed in a pogrom.
The presence of DNA sequences also
found in Ashkenazi Jews today was one clue.
“When we first tested the DNA, we got just a
few hundred base pairs,” says Ian Barnes, an
evolutionary geneticist at London’s Natural
History Museum who worked on the study.
“The best I could say was they were compatible
with Jewish origin.”

The bones were reburied in 2013 in a 
multidenominational ceremony. Five years 
later, the local Jewish community asked to 
move them to a quieter spot. Before they 
were reburied a second time, the authors 
secured permission to analyze them again, 
this time yielding higher-resolution results 
that definitively linked them to modern Ash- 
kenazi populations. Subsequent radiocarbon 
dating, meanwhile, showed the bones were 
about 800 years old, placing them around 
the same time as an 1190 massacre of the
town’s Jews, which took place on the eve
of the Third Crusade. “The minute you get
dates, it seems reasonable it could be this
one antisemitic event,” Barnes says. “Put together,
everything helped to build the case.”

The DNA results from Norwich and
Erfurt both confirm that modern Ashkenazim
are descended from a small founding
population. Based on modern Jewish
DNA, some researchers had speculated 
this founder group emerged from a popula- 
tion crunch in the 13th and 14th centuries, 
when the religious fervor of the Crusades 
and false accusations that Jews spread the 
Black Death sparked violent pogroms. But 
the new data point to a different scenario 
that played out earlier. 

“We already see clear evidence for that 
bottleneck” in the 14th century teeth from 
Erfurt, Carmi says. Disease mutations and 
long stretches of identical genetic code in 
the medieval DNA implied the bottleneck 
occurred centuries earlier. One type of mi- 
tochondrial genetic material—DNA that 
is passed through the maternal line—was 
identical in one-third of the people in the 
excavated plot, evidence that they all de- 
scended from a single woman who probably 
lived 500 to 1000 years earlier. 

In the Norwich individuals—which DNA 
shows included three sisters and a young 
boy with red hair and blue eyes—geneticists 
found the same disease markers seen in 
modern Ashkenazi populations, at about the 
same frequencies. “That tells you the bottleneck
happened prior to these individuals
dying in 1190,” Barnes says. “This is a clear
example of how powerful these data are.”

At the same time, small differences among 
the dozens of Erfurt genomes suggested 
medieval Ashkenazi communities weren't 
completely homogeneous, despite the earlier 
bottleneck. In the city archives, co-author 
Maike Lämmerhirt of the University of Erfurt
found clues to that diversity in tax and
property records. Some of the names of 14th
century Jewish residents—Salman of Würzburg,
Abraham of Rothenburg—suggest
family roots in Bavaria, south and west of 
Erfurt. Other prominent Erfurt Jews, such as 
Baruch of Pilsen or Jacob of Bohemia, appar- 
ently traced their ancestry far to the east, in 
one case as far away as modern-day Kalinin- 
grad, now in Russia, or beyond. 

That mixture of east and west “is exactly 
what we get from the genetic results,” Sczech 
says: After first branching out from a single, 
small founding population into small com- 
munities across Europe, including medieval 
Great Britain, the medieval Ashkenazim ap- 
parently mixed back together in places like 
Erfurt generations later. 

Isotopic signals in the teeth supported the 
DNA evidence. Tooth enamel captures the 
isotopic mixture of the water where a per- 
son grew up, and in the Erfurt cemetery the 
analysis showed parents with distant birth- 
places—probably in Eastern Europe—were 
buried near their children, who grew up locally.
“That’s a hint that we have the founders,”
Sczech says. “It’s the first time ever
we have all these results coming together.”

The partial excavation of a cemetery
in 2013 to clear the way for a parking
garage ramp exposed dozens of graves.

By comparing the Erfurt genomes with 
modern and ancient DNA data from many 
different populations, the researchers were 
able to peer even further back, to the origins 
of those scattered European communities. 
The comparisons suggested the Ashkenazi 
circa 1350 had a mix of ancestry resembling 
populations from southern Italy or Sic- 
ily today, with components found in mod- 
ern Eastern Europe and the Middle East 
mixed in. “That fits the historical data,” says
Krishna Veeramah, a geneticist at Stony 
Brook University who was not involved in 
the work. 

One traditional tale about Ashkenazi roots 
may not be far from the truth: A family or 
small group of Jews arrived in Germany 
around 800 C.E., crossing the Alps at the in- 
vitation of Charlemagne, the first Holy Ro- 
man emperor, and settled in the Rhineland. 


BEYOND A LOOK at population dynamics, 
the Erfurt results offer a window into me- 
dieval Jewish culture. Although genetic 
variations show the postpogrom commu- 
nity drew its members from across Europe, 
their burials are all similar, reflecting com- 
mon traditions. “They were all considered 
Ashkenazi Jews,” Carmi says. 

The genetic similarity between the Er- 
furt community and modern Ashkenazim 
600 years later was also telling. “These people
lived about 25 generations ago, and an
intermarriage rate with outsiders of more
than one in 500 per generation would have
shifted Ashkenazi ancestry by an amount
we could detect,” Reich says. “But that
didn’t happen. That’s new information.”
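
Reich's threshold can be checked with a rough calculation. The sketch below assumes a constant per-generation rate applied independently in each generation, which is a simplification not stated in the article, and the function name is purely illustrative.

```python
def cumulative_outside_ancestry(rate_per_generation: float, generations: int) -> float:
    """Expected fraction of ancestry from outsiders after repeated admixture.

    Assumes the same small fraction of the gene pool is replaced in every
    generation (a simplifying assumption, not the researchers' model).
    """
    return 1.0 - (1.0 - rate_per_generation) ** generations


# Figures quoted by Reich: one in 500 per generation, about 25 generations.
shift = cumulative_outside_ancestry(1 / 500, 25)
print(f"{shift:.1%}")  # just under 5% under these assumptions
```

Under these assumptions, even a rate of one in 500 would add up to a shift of nearly 5% over 25 generations, which gives a sense of why such a rate would leave a detectable trace.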

The Jewish community’s central position 
in medieval European life makes its genetic 
isolation even more remarkable. The role 
of Jews as bankers, craftspeople, traders, 
and money lenders would have put them in 
constant contact with their Christian neigh- 
bors. In Erfurt, as in many medieval cities, 
synagogues, ritual baths, and Jewish houses 
were in the heart of town, right next to the 
city hall and at the intersection of two ma- 
jor roads. Archival records show Jews and 
Christians went into business together and 
Christians served as wet-nurses to Jewish 
children. “Jews and Christians were constantly
in each other’s lives. But it looks
like they didn’t have children together,”
Baumgarten says. “We as modern people 
don’t have the words to describe that com- 
plex sense of belonging.” 

Today Erfurt’s Jews are at the center of 
the city once more, as part of an effort to 
make its ancient synagogue and mikvah a 
UNESCO World Heritage Site dedicated to 
Jewish daily life in the Middle Ages. It’s a 
story Baumgarten says is often overshad- 
owed by the horrors of the Holocaust and 
the pogroms that punctuated centuries of
co-existence. “To make it only into a dark
story would be a mistake for European cul- 
ture at large,” she says. “The truth is the 
Jews of medieval Germany settled there by 
invitation, were welcomed there when they 
came and were integrated into medieval 
German space—and all the while were a re- 
ligious, and sometimes persecuted, other.” 

Now their genomes are proving central 
to the story of all Ashkenazim, including
its formative event, the bottleneck that so 
dramatically marked the DNA of millions 
of people living today. Though these new 
studies help pin down its timing, no one 
knows what caused it. Bottlenecks call to 
mind catastrophes, such as massacres or 
discrimination that prevents people from 
marrying outside their community. “It’s fair 
to say Jewish history is one big sequence of 
bottlenecks,” Rutgers says. 

But to Reich, the Erfurt data suggest a 
brighter possibility: that long before the 
Erfurt Jews were laid to rest, somewhere 
in Europe a few dozen people flourished, 
passing their genes and culture to millions 
of people living today despite a history of 
brutal persecution. “Perhaps it’s the legacy 
of a small town that had a tradition of very 
large families, or maybe a few towns with 
very strong founder events,” Reich says. 
“Bottlenecks are often thought of as a cri- 
sis, but sometimes it’s a group that’s been 
incredibly successful.”

What set Siberia ablaze?


Shifting air currents and early snowmelt drove extreme Arctic fires from 2019 to 2021 


By Gabriela Schaepman-Strub and
Jin-Soo Kim


Extreme wildfires are being reported
worldwide, contributing to global
warming by emitting substantial
amounts of carbon dioxide, destabilizing
ecosystems, and causing major
socioeconomic damage. Boreal forests
and Arctic tundra have experienced devastating
fires in recent summers. From 2019 to
2021, more than 90% of these fires occurred
in central and eastern Siberia. On page 1005
of this issue, Scholten et al. (1) report the
underlying causes of these events. The main 
drivers include a northward shift of a major 
air current (the Arctic front jet stream), early 
snowmelt, and frequent lightning strikes. 
These conditions have led to a migration of 
fires that threaten even the northernmost 
part of the land, the Arctic coast. 
Fires are a naturally occurring ecological 
disturbance in boreal forests. However, their
frequency and the resulting burned areas
have almost doubled in the recent decade 
compared with the previous decade (3.0 x 10° 
ha) in Siberia (2). Wildfires in North America 
have been connected to spatiotemporal pat- 
terns of the polar front jet stream that moves 
across the continent (3). Scholten et al. now 
find similar patterns for the recent extreme 
fire seasons in Siberia. The polar jet—the 
strongest wind in the upper atmosphere 
(troposphere)—blows west to east across 
Siberia and separates the air masses of the 
Arctic and the mid-latitudes of the Northern 
Hemisphere. However, this air stream has mi- 
grated northward. This has resulted in slower 
movement of air masses over Siberia, which 
stagnate for a long time over the region, a 
phenomenon known as “blocking.” The land 
surface consequently heats up, accelerating 
snowmelt. This also intensifies atmospheric 
convection (the rising of air), which increases 
lightning activity and the risk of fires. 
Anthropogenic greenhouse gas emissions 
and the consequential warming of Earth 
and its atmosphere have pushed the Arctic 
front jet stream northward (4). Unless emis- 
sions decrease rapidly, lightning activity in 
the Arctic will occur more frequently (5). 
The impacts of more frequent and intense
Siberian boreal fires are wide-ranging, are
of large magnitude, and affect all com- 
ponents of Earth’s systems—lithosphere 
(land), hydrosphere (water), atmosphere 
(air), and biosphere (life). Fires are an 
ecosystem disturbance. If fire frequency 
and intensity increase, vegetation might 
change in structure and composition, for 
example, from larch forests to deciduous 
broadleaf or nondeciduous needleleaf for- 
ests. Fires also affect local wildlife and live- 
lihoods. Indigenous people in Siberia have 
reported a northward shift of bears and 
wolves, driven by fires in the regions south 
of Siberia, as well as the loss of vast reindeer 
pastures (6). Even faraway ecosystems are 
affected, because wildfire aerosol deposi- 
tion may fertilize the Arctic Ocean and in- 
crease marine primary productivity (7). 
Siberian fires also “feed back” to Earth’s 
climate directly through burning highly or- 
ganic soils and contributing to at least one- 
third of the total global fire carbon emissions 
in 2021 (8). For example, after the organic soil 
layer and insulating mosses have burned off, 
the remaining charcoal that covers the per- 
mafrost absorbs most of the incoming solar 
radiation. The permafrost (ground that has 
continuously remained below 0°C for at least
2 years) becomes vulnerable to increased
ground heat flux. Yedoma is a distinct type of 
ancient permafrost found across Siberia that 
was deposited during the late Pleistocene 
(between 129,000 and 11,700 years ago). This 
soil has accumulated carbon over thousands 
of years, and thawing releases stored carbon 
into the atmosphere for many years after a 
fire. Further, when the variability of biomass 
burning was included in modeling, the effect 
of fire-producing aerosols on cloud forma- 
tion led to more warming (9). Altogether, ex- 
treme fire events might push Arctic perma- 
frost systems to more rapidly reach a climate 
tipping point that, when crossed, will result 
in irreversible changes in the Arctic ecosystem
and global climate (10).

Although climatological conditions are 
the basic ingredient for fires, they will be 
influenced by other natural and anthropo- 
genic factors. These include fire-soil and 
fire-vegetation interactions that can am- 
plify climatic effects on tundra fires. These 
interactions can also have a dampening ef- 
fect. For example, burnt areas in Alaska are 
expected to continue to increase because of 
climate warming, but areas that once had a 
fire have run out of fuel for about a decade, 
limiting fire spread (11). Conversely, the 
development of infrastructure, especially 
roads, may further increase the likelihood 
of ignition because of human activities 
such as campfires, electrical accidents from 
power lines, and cigarette butts. However, 
roads can also suppress wildfires by acting 
as physical barriers and by providing access
points for firefighters (12).

Tundra patchwork remains after fire moved through
the Indigirka River lowland in Siberia in 2020.

At present, fires in Siberia are only actively suppressed
when they threaten villages or infrastruc- 
ture. But anthropogenic fires could be re- 
duced through policies that strengthen pen- 
alties for arsonists, and ecosystem damage 
could be suppressed by better fire manage- 
ment (13). How to manage fires in protected 
areas remains an area of research across the 
globe, because more frequent and extreme 
fires reduce the fuel available for future 
fires. Hence, if fires are prevented, fire fuel 
builds up. But a future fire could be much 
stronger and destroy trees that are normally 
fire resistant under weaker fire conditions. 
There is an ongoing debate about whether
it is safer to let fires burn, provided they are not
too big, in order to prevent very large fires in the future.
Predicting how the shift in the polar 
front jet and earlier snowmelt develop un- 
der future climate warming, and how this 
will affect fire occurrence, requires more 
sophisticated Earth system modeling. 
Lightning-driven fire ignitions in high lati- 
tudes, for example, are not yet adequately 
represented in such models. Fires might 
limit Arctic shrub expansion (14), but
this element is not included in large-scale 
models that predict future tundra vegeta- 
tion (15). Ground measurements that are 
important for modeling, but have not yet 
been made, include determining the depth 
of soil in Siberia that has been burned; the 
amount of carbon released to the atmo- 
sphere, groundwater, and rivers; vegetation 
growth after fire; and microclimate effects. 
Until such measurements are available, sci- 
entists can use satellite data to better un- 
derstand fire events, climatological causes, 
and their consequences from a distance.

What lies beneath Yellowstone?


There is more magma than 
previously recognized, 
but it may not be eruptible 


By Kari M. Cooper 


What controls how and when a volcano
will erupt? Despite some notable
successes (1, 2), forecasting
volcanic eruptions remains a challenge—not
least because there is
no way to directly see what is happening
beneath volcanoes. Instead, indirect
methods are used to glimpse conditions 
below the surface. An obvious but key re- 
quirement for an eruption is the presence 
of magma (molten rock, consisting of vari- 
able proportions of liquid, solid crystals, 
and volatiles). This magma also needs to 
be distributed so that it can mobilize and 
erupt as a coherent body. Therefore, a key 
issue for eruption hazard assessment is to 
ascertain how much magma is below the 
surface and where. On page 1001 of this is- 
sue, Maguire et al. (3) modeled seismic data 
to image melt (the liquid part of magma) 
beneath the Yellowstone Caldera. They con- 
clude that more melt is present than had 
been recognized, and it is located at shallow 
depths in the crust. 
What is known about conditions within 
a magma reservoir relies on three main 
approaches: geophysical imaging (such as 
that presented by Maguire et al.), numeri- 
cal modeling (4-6), and inferences based 
on analyses of the erupted products (7-9). 
Collectively, this kind of work has led to 
a revolution in ideas about what magma 
storage looks like. Early models envisioned 
a large, mostly liquid magma body that 
persisted beneath volcanoes for tens or 
hundreds of thousands of years [the “big
tank” model (10)]. However, a large body
of evidence from more recent studies indicates
that magma storage regions are much
more complex. At any given time, much of 
the magma reservoir is likely dominated 
by solid material rather than liquid [the
mush model (11, 12)] and may be present
at relatively cool temperatures for most of 
the reservoir’s history (8). Reservoirs also 
involve storage at multiple depths in the 
crust [“transcrustal magma reservoirs”
(13)]. Accumulations of liquid-rich magma 
in the upper-crustal reservoir are likely 
short lived, with large eruptions often tap- 
ping multiple distinct stored magma bod- 
ies (7, 14) that are present simultaneously 
as separate “pods” of liquid-dominated 
magma within the larger crystal-dominated 
mush system. 

Thus, ascertaining where the magma is 
has two parts: the range of depths below the 
surface where magmas are stored, and how 
the magma is distributed. Maguire et al. 
found that the largest concentration of melt 
lies at 3 to 8 km below the surface. Despite 
the improved spatial resolution of their ap- 
proach, Maguire et al. cannot rule out the 
presence of liquid-rich magma bodies of a 
size that could produce lava flows similar 
to those that erupted since the most recent 
caldera-forming eruption at Yellowstone. 
Moreover, “how much” also depends on the 
distribution of melts; the amount of melt 
needed to account for the slowest seismic 
wave speeds in the geophysical data they 
analyzed ranges from ~10 to 20% depend- 
ing on whether the melts are concentrated 
into pods or spread out as thin films be- 
tween mineral grains. 

The eruptive and hazard potential also 
depend in part on how the melt is distrib- 
uted. It is easier to mobilize liquid-domi- 
nated bodies than to disrupt and mobilize 
a crystal mush, although both possibili- 
ties can occur (5, 6, 15). The configuration 
of melts within a mushy magma reservoir 
undoubtedly encompasses a spectrum ranging
from thin films (millimeter scale)
to small melt segregations (centimeter 
to meter scale) and larger melt-rich pods 
(hundreds of meters to kilometer scale) (see 
the figure). Determining which distribution 
exists in which part of the reservoir at any 
given time requires bringing in information 
from geochemical and petrological studies. 
The observation of Maguire et al. that the 
melt is focused within the shallow crust is 
consistent with depth estimates from stud- 
ies of past eruptions, and therefore these 
petrological insights into the past can be 
used as a window to understand what 
might be in the reservoir today. 

Melt distribution within a magma reservoir
Liquid magma (melt) is distributed at different scales in the reservoir beneath the Yellowstone Caldera.
Mobilizing melts to produce an eruption is much more difficult if melts are dominantly present as small
segregations or thin films.
1 Melt-dominated pods within the larger crystal mush reservoir
Melt distributed mainly in the center of the reservoir

In order to do this, the temptation to overgeneralize
from a single sample or from a
few analyses should be resisted. Such an approach
reflects the “big tank” era, in which
magma reservoirs were envisioned as a
single body and thus a single sample might
be informative. For example, examination
of analyses of many mineral grains shows
that even individual football-sized samples
of eruptions contain minerals that formed 
from melts with widely varying composi- 
tion and were stored at differing tempera- 
tures at the same time. Such variations are 
difficult to reconcile with a big tank model, 
which would predict that minerals would 
crystallize from a uniform magma body and
therefore would be compositionally and 
thermally homogenous at any given time. 
To move forward, it is necessary to collec- 
tively move past the idea that any single 
component of a magma body is representa- 
tive of the whole, instead recognizing that 
each captures distinct information about a 
different part of the same complex system. 

In the future, there is potential to learn 
about magma reservoirs by using multiple 
different approaches within the same volca- 
nic system. Geochemical studies can provide 
insights into melt distribution in the past by 
understanding how much variability of composition
(and therefore how many distinct
magma bodies) is sampled by single eruptions.
With enough studies that combine
the present-day view from seismology with 
the historical perspective from petrology in 
enough volcanic systems, a seismic snapshot 
of a magma reservoir might be captured just 
before an eruption. This could be compared 
with the record of past processes contained 
in deposits from the same eruption and re- 
sults from numerical models to understand 
the physical processes that led to an erup- 
tion. In turn, this would provide the context 
needed to interpret seismic snapshots and 
the signals that precede eruptions in terms 
of the actual physical and chemical processes 
occurring in a magma reservoir. 

Figure labels, melt distribution within the reservoir:
2 Small segregations of melt at centimeter to meter scale occur within a crystal mush.
3 Thin films of melt along crystal boundaries, likely dominant near the boundaries of the reservoir.

The results of Maguire et al. do not indicate
that an eruption is more likely than
previously thought; the uncertainty about
how melts are distributed means that 
more melt is not necessarily more hazard- 
ous, and continuous and careful monitor- 
ing of the volcano by the United States 
Geological Survey (USGS) and partners in 
the Yellowstone Volcano Observatory has 
shown no indication of unusual activity. 
However, with concerted and targeted ef- 
forts at understanding processes beneath 
the surface that lead to eruptions, together 
with monitoring what volcanoes are doing 
today, the ambition of understanding what 
makes volcanoes do what they do is coming
closer.

Getting tougher in the ultracold


Certain alloys show exceptional toughness in a liquid helium environment 


By Peng Zhang and Zhe-Feng Zhang 


The disastrous accident of the Titanic
in 1912 occurred when the British
ocean liner struck an iceberg and
sank. Insufficient fracture resistance
of the ship’s steel at low temperature
ensured the ship’s demise (1). Thereafter,
selection of materials with excellent
cryogenic toughness became an important 
prerequisite for low-temperature, load- 
bearing applications. Despite progress in 
the understanding of fracture mechanics, 
most metallic materials show re- 
duced toughness with decreas- 
ing temperatures (2), especially 
in the realm of the temperature 
of liquid helium (-269°C; 4.15 
K). Thus, searching for tough al- 
loys in cryogenic temperatures 
has remained a challenge. On 
page 978 of this issue, Liu et 
al. (3) report that certain alloys 
containing the metals chromium 
(Cr), cobalt (Co), and nickel 
(Ni) show exceptional fracture 
toughness at 20 K and attribute 
this property to a sequence of 
deformation mechanisms. These 
metal mixtures could potentially 
be used for applications in espe- 
cially low temperatures, such as 
deep-space exploration (4). 
Medium- and high-entropy al- 
loys are classes of metallic ma- 
terials that have three or more 
constituents in equal amounts. 
Those designed with Cr, Co, and Ni as 
principal elements display high tolerance 
to damage, which has triggered a hunt for 
CrCoNi-based alloys that can withstand 
extreme environments, such as very low 
temperatures. But designing alloys with 
low-temperature toughness stems from 
understanding the propagation of cracks 
(fracture properties) and mechanisms that 
underlie a material’s resistance to fracture 
(5)—that is, its toughness. For example, 
metals with face-centered cubic (fcc) struc- 
ture usually exhibit excellent toughness at 
low temperature (6).

Institute of Metal Research, Chinese Academy
of Sciences, Shenyang 110016, China.
Email: pengzhang@imr.ac.cn; zhfzhang@imr.ac.cn

Unfortunately, when
the temperature reaches that of the liquid
nitrogen environment (reportedly between
63 and 77 K), the resistance to fracturing 
of most fcc alloys decreases. Clearly, fcc 
structure alone cannot guarantee low- 
temperature toughness. Another important 
property of fcc alloys is the formation of 
faults in the normal planar stacking of at- 
oms in a crystal. This occurs by decreasing 
the stacking-fault energy. Such irregulari- 
ties carry an energy cost because of strain. 
Stacking-fault energy inversely scales with
the amount of mechanical stress needed
to deform the crystal lattice (so-called mechanical twinning).

Fractography reveals excellent ductile fracture characteristics (dimples)
in the high-entropy alloy CrCoNi at ultralow temperature (20 K).

Steel comprising iron
(Fe), manganese (Mn), and carbon (C), for 
example, shows increasing toughness at 
low temperature when the stacking-fault 
energy decreases (7). 

Liu et al. explored the mechanisms un- 
derlying the toughness of CrCoNi-based al- 
loys at 20 K (experiments were conducted 
in a liquid helium environment, but the 
alloy surface only reached 20 K). The au- 
thors measured and observed behaviors 
such as crack initiation, deformation, and 
fracture to determine how these metallic 
materials display increasing resistance to 
fracture as the temperature decreases, in 
contrast to most alloys. The authors ob- 
served that a low stacking-fault energy 
promotes changes in deformation mecha- 
nisms under high stress that account for 


fracture toughness. A sequence of such 
mechanisms simultaneously increases the 
strength and toughness of CrCoNi-based 
alloys at cryogenic temperatures (8). 

The findings of Liu et al. support the no- 
tion that there are three central principles 
for optimizing the low-temperature tough- 
ness of metallic materials. These principles 
follow those for improving the strength and 
plasticity of medium-entropy alloys—that 
is, optimizing resistance to stress that a ma- 
terial can bear before deforming (strength) 
and the ability to deform under stress with- 
out breaking (plasticity). One of 
these principles regards achiev- 
ing a high elastic modulus to in- 
hibit brittle fracture. For brittle 
materials, such as glass and ce- 
ramics, a crack will rupture sur- 
faces with weak atomic binding 
force and lead to catastrophic 
fracture (5). Conversely, for duc- 
tile alloys, a crack will be blunted 
by emission of dislocations (that 
is, a crystallographic defect 
within the structure) and/or in- 
creasing mechanical twinning. 
The latter consumes energy that 
is dissipated as the microstruc- 
ture changes—a process known 
as plastic work. Therefore, the 
key to improving toughness of 
materials is to inhibit the oc- 
currence of brittle fracture and 
improve the plastic work in the 
vicinity of the crack tip (the point 
at which a crack propagates). 

Among the three principles (8), increas- 
ing the elastic modulus can increase the 
load required for fracture. This, in turn, in- 
hibits propagation of a crack, resulting in 
high fracture toughness. For magnesium, 
aluminum, copper, and titanium alloys and 
steels, fracture toughness increases almost 
synchronously with their elastic modulus 
(3, 9). The CrCoNi-based medium- and 
high-entropy alloys examined by Liu et al.
have a modulus of >200 GPa, similar to 
that of steel (10), which accounts for their 
high fracture toughness. 

Another design principle is reducing 
the stacking-fault energy to improve the 
plastic work (8). Reducing this energy can 
simultaneously improve the strength and 
ductility of copper alloys (11), FeMn-based
steels (12), and various fcc alloys (13). Because
plastic work can be estimated as the
product of strength and plasticity, reduc- 
ing the stacking-fault energy can improve 
the plastic work. Indeed, Liu et al. lever- 
aged the stacking-fault energy effect to 
create a CrCoNi-based high-entropy alloy 
to achieve excellent low-temperature frac- 
ture toughness. 
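
As a rough illustration of this estimate (the symbols are introduced here for convenience and are not taken from the cited work), the plastic work per unit volume can be written as the product of a characteristic strength and the plastic strain sustained before failure:

```latex
W_p \approx \sigma \, \varepsilon_p
```

With stress in MPa and strain dimensionless, W_p has units of MJ/m^3, so a change that raises both strength and plasticity raises the estimate on both counts.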

The third design principle relates to fcc 
structural stability (8). A metallic material 
with this structure may undergo crystal 
structure transition (phase transforma- 
tion) during deformation at low tempera- 
tures. It is generally believed that this 
transition will form a brittle structure that 
is harmful to the toughening of materials. 
However, not all metals with such transformed
structures are brittle (14), and the
energy generated with the phase transfor- 
mation can improve toughness. Therefore, 
the key to controlling fcc stability is ensur- 
ing that the plastic work will not decrease 
as a result of phase transformation but will 
instead inhibit phase transformation (8). 
This conclusion is also verified by Liu et al.

Machine learning, rather than endless 
experimental searching (15), is expected to
accelerate the material design process for 
cryogenic applications. Indeed, elastic mod- 
ulus, stacking-fault energy, and fcc phase 
stability could be used in machine learning 
programs. These three characteristics can 
be obtained not only through database que- 
ries but also through high-throughput ex- 
periments and first-principles calculations 
(8). Machine learning combined with bet- 
ter understanding of physical mechanisms 
should create enormous opportunities for 
materials discovery. 
Active DNA demethylation damages DNA


Active DNA demethylation maintains enhancer activity 
in nonproliferating cells but can damage DNA 


By Isaac F. López-Moyado and Anjana Rao


In mammalian genomes, cytosine residues
can be covalently modified by addition
of a methyl group. DNA methylation
levels at gene regulatory regions
change as cells respond to stimuli, but
whether this is a cause or consequence
of changes in gene expression is unknown. 
On page 983 of this issue, Wang et al. (1) 
show that active DNA demethylation is 
necessary for enhancer activation and lin- 
eage specification in postmitotic neurons 
and macrophages. 

DNA demethylation involves 5-methyl- 
cytosine (5mC) oxidation by the ten-eleven 
translocation (TET) methylcytosine dioxy- 
genases. Most investigations of demethyl- 
ation have examined proliferating cells, in 
which the oxidized methylcytosines 5-hy- 
droxymethylcytosine (5hmC), 5-formylcy- 
tosine (5fC), and 5-carboxylcytosine (5caC) 
facilitate “passive,” replication-dependent 
DNA demethylation. The hemi-modified 
DNA strands containing these bases are 
poorly recognized by DNA methyltransfer- 
ase 1 (DNMT1), and therefore methylation 
is progressively lost. A different process of 
“active” (replication-independent) DNA de- 
methylation involves thymine DNA glyco- 
sylase (TDG), which excises T:G mismatches 


but also excises the oxidized methylcytosines 
5fC and 5caC when normally base-paired to 
G (see the figure). 

In postmitotic cells such as fully differenti- 
ated neurons, DNA demethylation is replica- 
tion independent. Neuronal cells have high 
levels of 5hmC (2, 3), and postmitotic mouse 
Purkinje neurons require TET proteins and 
oxidized methylcytosines for DNA demeth- 
ylation of select gene regulatory regions 
(3). Additionally, in primary neurons and 
in “iNeurons” derived from human induced 
pluripotent stem cells (iPSCs), regions that 
regulate gene expression, such as enhanc- 
ers, accumulate single-stranded DNA breaks 
(SSBs) (4, 5). These SSBs occurred largely at 
methylcytosine residues at which TET was 
active (4). Wang et al. investigated whether 
TET-TDG-mediated 5fC and 5caC excision 
might be the source of these SSBs. 

SSBs resulting from 5fC and 5caC excision 
can be repaired by two distinct processes 
of base excision repair (BER): short-patch 
repair by DNA polymerase β (Pol β), which
replaces the single missing nucleotide with
unmodified C; and long-patch repair, in
which Pol ε and Pol δ synthesize 2 to 30
nucleotides using the undamaged strand as a
template. Wang et al. showed that iNeurons 
deficient for TDG accumulated far fewer 
SSBs than iNeurons with active TDG. SSBs 


Programmed breaks in active DNA demethylation 

CpG dinucleotides are methylated by DNA methyltransferases (DNMTs). During active DNA demethylation, 
ten-eleven translocation (TET) enzymes oxidize 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC),
5-formylcytosine (5fC), and 5-carboxylcytosine (5caC). 5fC and 5caC are removed by thymine DNA glycosylase
(TDG), which generates a single-stranded DNA break (SSB). This is repaired by short-patch or long-patch base 
excision repair (BER). Incorporation of chain-terminating cytidine analogs [such as the chemotherapeutic 
arabinosylcytosine (Ara-C)] blocks BER and results in the accumulation of DNA damage, leading to cell death. 
were similarly TDG dependent in mouse
“iMacrophages” reprogrammed from precursor
B cells (pre-B cells), although different
strategies were needed because iNeurons
preferentially use long-patch BER, whereas 
iMacrophages use almost exclusively short- 
patch BER. This indicates that the SSBs at 
enhancers are caused by TDG. 

Because the highly coordinated activity of 
the TET-TDG-BER machinery rapidly repairs 
any SSBs that arise during active DNA de- 
methylation (6), Wang et al. used a mixture 
of chain-terminating dideoxynucleosides to 
optimally detect SSBs associated with active 
DNA demethylation in iNeurons and iMac- 
rophages. Similar nucleoside analogs, such 
as arabinosylcytosine (Ara-C, also called cy- 
tarabine), are commonly used as anticancer 
chemotherapeutics to inhibit DNA replica- 
tion. When used at high doses, these drugs 
can have neurotoxic side effects (7). Wang et 
al. tested the hypothesis that SSBs produced 
by active DNA demethylation would facili- 
tate the addition of cytidine analogs such as 
Ara-C, which would be incorporated in place 
of cytosines during long-patch BER and so 
would induce DNA damage at enhancers. 
Indeed, Wang et al. found that treatment of 
iNeurons with Ara-C in culture triggered a 
robust DNA damage response and resulted 
in cell death, but notably only when TDG 
was present in these cells. These results dem- 
onstrate that TDG can render cells suscep- 
tible to further DNA damage in the presence 
of nucleoside analogs, and call attention to 
the potential of TDG inhibitors in prevent- 
ing the neuronal cell death that is associated 
with chemotherapy. 

Although the study of Wang et al. estab- 
lishes the role of TDG in generating SSBs, 
the importance of active DNA demethylation 
in promoting enhancer function is less clear. 
The transcriptional consequences of TDG 
deficiency appear to be cell-type and context 
specific. TDG is essential for reprogram- 
ming mouse embryonic fibroblasts to iPSCs 
(8), and germline TDG deficiency results in 
mouse embryonic lethality (9, 10). But TDG 
deficiency has little effect on demethylation 
of the paternal genome in mouse zygotes 
(11), and adult mice with conditional dele- 
tion of Tdg in all tissues show normal T 
cell differentiation and hematopoiesis (12). 
Wang et al. showed that TDG affects only a 
subset of iMacrophage functions: TDG was 
not required for pre-B to iMacrophage repro- 
gramming but was necessary for enhancer 
activity and consequent increased expression
of genes involved in iMacrophage function,
including CSF1R (macrophage colony-stimulating
factor 1 receptor) and the FIRE
(fms-intronic regulatory element) super-enhancer,
which controls CSF1R expression
(13). Moreover, iMacrophages in which TDG 
was deleted were unable to phagocytose bac- 
teria, suggesting that enhancers required 
for this process are normally demethylated 
by TDG. Presumably, enhancers sensitive to 
TDG depletion contain CpG dinucleotides 
that if methylated, would interfere with the 
binding of key transcription factors neces- 
sary for enhancer function. 

The study of Wang et al. reinforces the 
idea that DNA demethylation stemming 
from TET-TDG activity can be essential 
for enhancer function in nonproliferating 
cells. This may also be the case in other cell 
types because in every cell type studied, the 
most active enhancers are also the most 
highly enriched with 5hmC. Moreover, the 
study establishes that TDG activity can be 
a source of programmed breaks (SSBs) in 
the cell. Although the active DNA demethyl- 
ation machinery is quite efficient, SSBs can 
pose a threat for cells when misrepaired— 
for instance, in the presence of Ara-C. It is 
also possible that high levels of DNA repair 
synthesis through long-patch BER make 
long-lived neurons susceptible to mutagen- 
esis, and this mechanism may explain why 
mutations in genes encoding SSB repair en- 
zymes are common in several neurodegen- 
erative diseases (14). Indeed, recent single- 
cell sequencing of human neurons revealed 
an age-dependent increase in mutations 
enriched at DNA repair hotspots associated 
with enhancers (15), implying that TDG 
activity might be involved. Future studies 
should establish how TET-TDG activity, 
active DNA demethylation, and associated 
SSBs at enhancers affect neuronal function 
during development, aging, and the onset 
of neurological diseases. 
Triggering rare HIV antibodies by vaccination


Clinical trial shows that an HIV vaccine can elicit
rare antibodies, but there is more to do


By Penny L. Moore


An HIV vaccine is urgent: A recent
UNAIDS report entitled “In Danger”
showed that in 2021, one adolescent
girl or young woman became infected
with HIV every 2 min, especially in
sub-Saharan Africa (1). A vaccine will
likely need to elicit broadly neutralizing anti- 
bodies (bnAbs), which are able to recognize 
globally diverse HIV strains and can prevent 
HIV infection (2). However, triggering bnAbs 
by vaccination has proven impossible so far. 
A key challenge is that bnAbs rarely develop, 
even during infection. Furthermore, bnAb 
precursors (or germlines) are uncommon in 
human immunological repertoires. Devising 
vaccines specifically aimed at recruiting 
these rare precursors, an approach referred 
to as “germline targeting,” has driven major 
immunological advances with broad appli- 
cability for other pathogens. On page 964 of 
this issue, Leggat et al. (3) provide clinical 
proof of concept for the germline-targeting 
approach for HIV vaccination and detailed 
immunological insights upon which future 
vaccine trials can be designed. 

HIV bnAbs target multiple epitopes on the 
envelope protein. One epitope, the CD4 binding
site (CD4bs), has been a particular focus
for germline targeting because some bnAbs
to this site, called VRC01-class bnAbs [named
for their prototype, VRC01, a monoclonal
antibody isolated from an individual a decade
ago (4)] have a highly constrained angle of 
approach to their epitope (5). This translates 
to very limited genetic options with which 
the otherwise vast immunoglobulin (anti- 
body) repertoire can tackle this epitope (6). 
This focused antibody footprint was highly 
amenable to the design of the germline- 
targeting immunogen eOD-GT8, which is 
a self-assembling nanoparticle that presents
60 copies of an engineered HIV envelope protein
bearing mutations designed to enhance
its affinity for VRC01-class precursors (7-9).

The preclinical evidence supporting eOD- 
GT8 as a vaccine immunogen was compel- 
ling, with preclinical studies confirming 
that the vaccine bound to rare VRC01-like
precursors (9, 10). In mouse models carefully
chosen to replicate the scarcity of VRC01-like
precursors in humans (many mouse models 
overexpress antibodies of interest to vac- 
cine designers, giving biased readouts when 
extrapolated to humans), eOD-GT8 showed 
good priming or triggering of VRC01-class
antibodies (11).

The robust preclinical data provided a 
strong foundation for moving eOD-GT8 into
a phase 1 clinical trial of 48 individuals, as re- 
ported by Leggat et al., where the safety pro- 
file proved favorable. But measuring immune 
responses to eOD-GT8 was more complicated
than in most phase 1 vaccine trials. Vaccine 
trials generally rely on serological readouts 
of antibody responses. However, the criti- 
cal readout for how well eOD-GT8 worked 
to prime antibodies was a genetic readout. 
VRC01-class bnAbs have a specific genetic
“signature,” defined as the use of heavy chain 
variable gene alleles VH1-2*02 or *04 com- 
bined with any light chain complementar- 
ity determining region 3 that has a length 
of five amino acids. This precise genetic sig- 
nature enables their numbers to be tracked 
after vaccination. Sampling of blood and 
tissue from vaccine and placebo recipients 
and sequencing of immunoglobulin genes 
showed that eOD-GT8 consistently primed 
a substantial increase in VRC01-class B cells,
which were barely detectable before vaccina- 
tion (see the figure). However, it is important 
to note that this reliance on sequencing ap- 
proaches reflects that these are, at present, 
low-frequency antibody responses that are 
not yet detectable by traditional serological 
methods. Further work remains to be done to 
amplify these responses to higher concentra- 
tions, which are more likely to be associated 
with protection from infection. 
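
The genetic signature described above lends itself to a simple filter over sequenced antibody genes. The following is a minimal sketch, not the trial's actual analysis pipeline: the record type, field names, and example sequences are hypothetical, and only the two criteria stated in the text (heavy chain allele VH1-2*02 or *04, and a light chain complementarity determining region 3 of five amino acids) are encoded.

```python
from dataclasses import dataclass
from typing import Iterable

# Criteria taken from the text; everything else here is illustrative.
VRC01_HEAVY_ALLELES = frozenset({"VH1-2*02", "VH1-2*04"})
VRC01_LCDR3_LENGTH = 5  # amino acids


@dataclass(frozen=True)
class AntibodySequence:
    """Hypothetical record for one paired heavy/light chain sequence."""
    heavy_v_allele: str  # heavy chain variable gene allele, e.g. "VH1-2*02"
    lcdr3_aa: str        # light chain CDR3 amino acid sequence


def is_vrc01_class(seq: AntibodySequence) -> bool:
    """True if the sequence matches both parts of the signature."""
    return (
        seq.heavy_v_allele in VRC01_HEAVY_ALLELES
        and len(seq.lcdr3_aa) == VRC01_LCDR3_LENGTH
    )


def vrc01_class_fraction(seqs: Iterable[AntibodySequence]) -> float:
    """Fraction of sequences in a sample that carry the signature."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    return sum(is_vrc01_class(s) for s in seqs) / len(seqs)
```

Comparing this fraction in samples taken before and after vaccination is the kind of readout the paragraph describes, in place of a serological measurement.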

Importantly, the VRC01-class antibodies
triggered by eOD-GT8 vaccination also began
their journey toward becoming bnAbs.
bnAbs undergo a period of maturation where 
they acquire the ability to bind multiple HIV 
strains. The germline antibodies must ac- 
quire a very specific set of mutations (well 
defined from studies of CD4bs antibodies 
in HIV infection) that enables these precur- 
sors to become bnAbs. Sequencing of plasma 
cells and lymph node cells from vaccinated 
individuals in this trial showed that eOD- 
GT8-triggered antibodies were accumulat- 
ing some of the correct mutations and ma- 
turing in the direction of bnAbs—but not 
far enough. Encouraging further mutation
through vaccination is the next challenge for 
HIV vaccine design. Many possibilities are be- 
ing explored in preclinical and human exper- 
imental trials that are aimed at accelerating 
HIV vaccine discovery. Most are based on the 
idea of sequential administration of varying 
vaccines to guide the maturation of breadth 
(12). A key question that needs to be tested
experimentally is how large an antigenic leap 
these primed antibodies can tolerate without 
immunologically “dropping out” of the vac- 
cination process. Thus, if a booster shot is too 
different from the previous vaccine, antibod- 
ies that have been triggered by the first vac- 
cination may not recognize the booster and 
will not mature further. However, the incor- 
poration of many different shots into an HIV 
vaccine regimen is unappealing. Getting the 
balance right between the need for antibody 
maturation toward bnAbs and feasibility in 
the real world will be essential. 


Germline targeting 

Rare precursors of HIV broadly neutralizing 
antibodies (bnAbs) were elicited from vaccination 
with the nanoparticle eOD-GT8. However, for 
these antibodies to become protective against 
HIV infection, future vaccines will need to
drive further expansion and greater numbers
of antibody mutations.


The HIV envelope 
protein contains the 
CD4 binding site, 
which can naturally 
elicit bnAbs. 


A nanoparticle, eOD-GT8,
that presents 60 copies 
of part of the envelope 
protein was used as a 
vaccine immunogen in 
a phase 1 clinical trial 
with 48 people. 


Genetic sequencing of 
immunoglobulin genes 
that encode the light 
chain complementarity 
determining region 3 
(LCDR3) and heavy chain 
variable region revealed 
that rare precursors 
developed with some 
mutations that are typical 
of bnAbs. 


Additional vaccines expand 
the precursors? 


To convert precursors to 
HIV bnAbs, maturation, likely 
through boosters, will drive 
clonal expansion and the 
acquisition of mutations that 
are typical of bnAbs. 


Germline targeting will only work if 
people have the germline immunoglobulin
alleles that are being targeted. For VRC01-like
antibodies, most people appear to have
the required heavy chain alleles, although 
Leggat et al. identified 1 in 48 donors who 
did not and thus failed to respond to eOD-GT8
vaccination. A key caveat is that much
of what is known about human immunoge- 
netics is limited to Caucasian populations 
(13). There is an urgent need to expand such 
studies to include people with diverse an- 
cestries, particularly those most burdened 
by infectious diseases. 

The clinical trial reported by Leggat et al. 
provides persuasive human data supporting 
the concept of germline targeting. However, 
VRC01-class antibodies can be considered
low-hanging fruit for this approach owing to 
the highly conserved genetics of their precur- 
sors. bnAbs to other HIV epitopes are much 
more promiscuous in their approach to epi- 
tope binding (12). This has made it harder 
to engineer vaccines equivalent to eOD-GT8 
for other HIV bnAbs, although several can- 
didates are being tested (12). Public (shared) 
antibody responses to other pathogens such 
as influenza virus and severe acute respira- 
tory syndrome coronavirus 2 (SARS-CoV-2) 
are being increasingly identified, and germ- 
line targeting for these responses will surely 
gain traction with these clinical data. This 
trial is an essential step toward the develop- 
ment of a preventative vaccine that is able to 
trigger VRC01-class CD4bs bnAbs. In parallel,
ongoing efforts are being made to trigger
bnAbs to other HIV epitopes. An HIV vaccine 
will need to pull together all these research 
strands for a multipronged, polyfunctional 
approach. This will need sustained collabo- 
ration and investment in the basic immuno- 
logical research that resulted in this seminal 
proof of concept in humans. 
Legal reform to enhance global text and data mining research


Outdated copyright laws around the world hinder research 
Researchers engaged in text and data
mining (TDM) research collect vast 
amounts of digitized material and 
use software to analyze and extract 
information from it. TDM is a crucial 
first step to many machine learning, 
digital humanities, and social science ap- 
plications, addressing some of the world’s 
greatest scientific and societal challenges, 
from predicting and tracking COVID-19 
to battling hate speech and disinformation
(1-3). Although applications of TDM
often occur across borders, with research- 
ers, subjects, and materials in more than 
one country, a patchwork of copyright laws 
across jurisdictions limits where and how 
TDM research can occur. With the World 
Intellectual Property Organization (WIPO) 
Standing Committee on Copyright and 
Related Rights, and legislatures around the 
world, deliberating the harmonization of 
copyright exceptions for various research 
uses, we discuss policy measures that can 
ensure that TDM research is unambigu- 
ously authorized under copyright law. 
Most text, images, and other materi- 
als that TDM researchers use are subject 
to copyright law. Copyright law gives the 
owner of a protected work the legal right to 
prohibit reproduction, distribution, modi- 
fication, and other forms of exploitation of 
that work without the owner’s permission. 
These rights apply even if the material is 
readily accessible—for example, published 


on the internet or available in a library. 

The justifications for copyright are 
grounded both in the rights of individual 
authors in their creations and in instru- 
mentalist incentives for the creation and 
dissemination of new works. Copyright 
thus gives the author of a work, or the au- 
thor’s assignee (e.g., a publisher), the exclu- 
sive right to reproduce, transmit, and make 
derivatives of the protected work and to 
prevent the unauthorized appropriation of 
these rights (4). Although copyright origi- 
nally subsisted solely in textual works of 
authorship, today it has expanded to cover 
graphical and visual works and, in some 
countries, data and databases [though 
some countries, including member states 
of the European Union (EU), have separate 
statutes protecting databases]. 

Each stage of a TDM project is poten- 
tially constrained by copyright depending 
on how the scope of protection is inter- 
preted. Copyright prohibitions on unli- 
censed reproduction may be implicated 
when sources are digitized, formatted, and 
compiled into a corpus that can be mined 
for analysis. Copyright may also limit the 
application of an algorithm to a TDM cor- 
pus, which may make additional temporary 
copies in computer memory. Copyright re- 
strictions on transmitting and reproducing 
works may be implicated when research- 
ers collaborate, when examiners validate, 
and when publishers report results. Thus, 
without copyright permission, or the appli- 
cation of exceptions under copyright law, 
much of the world’s copyrighted material 
may be off limits to TDM use. 

Some publishers make limited copyright 
licenses available for TDM uses, often for 
additional fees charged to libraries or re- 
searchers. But paid licensing is not an af- 
fordable or viable option for many critical 


TDM projects. TDM research often re- 
quires use of massive datasets with works 
from many publishers, including copyright 
owners that cannot be identified or are un- 
willing to grant licenses. Forcing research- 
ers to use only licensed or public domain 
content (i.e., content in which there is no 
enforceable copyright) can restrict topics 
of study, hamper reproducibility and vali- 
dation (5), bias results (6), and dissuade re- 
searchers from undertaking projects (7). A 
lack of a license need not, and should not, 
be an absolute barrier to TDM research. 

The rights granted by copyright are not 
absolute. All international copyright trea- 
ties permit, and all countries have, excep- 
tions from copyright protection for various 
purposes, some of which may authorize 
TDM research. In the US, for example, a 
flexible exception exists for “fair use” for 
purposes such as education and research 
and has been interpreted by courts to per- 
mit at least some TDM uses. Copyright 
laws in many other countries contain ex- 
ceptions for research (or “scientific”) uses 
that can be interpreted to apply to TDM 
uses (4). But only about a fifth of these 
research exceptions are broad enough to 
permit the full range of TDM research, 
which requires the ability to copy, share, 
and analyze whole works in collaboration 
with others (8) (see the figure and table). 
For example, some countries have research 
exceptions that permit uses only of ex- 
cerpts of a work (e.g., Argentina), do not 
apply to uses of books or other kinds of 
works (e.g., most post-Soviet countries), or 
require membership in a specific research 
institute (e.g., Sweden). 

Empirical studies show that copyright 
exceptions for research matter—with corre- 
lations between more permissive research 
exceptions and higher production of cit- 
able works of scholarship (9) and increased 
academic use of TDM methodologies (10). 
But until legal enabling environments for 
TDM research can be harmonized, the full 
benefits from this new research frontier 
will remain inadequately explored. 


LEGALLY ENABLING TDM RESEARCH 

Ideally for researchers, a minimum stan- 
dard for global uses of TDM would be 
implemented everywhere. There are a 
number of avenues that policy-makers can 
take to promote more harmonization of 
copyright exceptions for TDM uses. 
International treaties
Nearly all countries provide a high baseline 
level of copyright protection as the result of 
several widely adopted multilateral trea- 
ties, beginning with the Berne Convention 
for the Protection of Literary and Artis- 
tic Works (Berne, 1886) and continuing 
through the WIPO “Internet Treaties” (Ge- 
neva, 1996). A key feature of these treaties 
is a requirement that signatory countries 
impose high standards for protecting 
copyright but leave exceptions, such as 
those permitting research, largely to the 
discretion of national legislatures and 
courts. The result is the fragmented land- 
scape of exceptions shown in the figure. 
But the tide is turning. The WIPO’s Stand- 
ing Committee on Copyright and Related 
Rights is now deliberating over the har- 
monization of exceptions for uses that in- 
clude research. Coalitions of researchers 
and academics are proposing that this 
forum draft a treaty that would permit 
cross-border and other uses of research 
materials to permit TDM everywhere (11). 
There are important precedents for an 
international treaty that imposes uniform 
copyright exceptions around the world. For 
example, WIPO’s last major treaty, the Treaty 


to Facilitate Access to Published Works for 
Persons Who Are Blind, Visually Impaired or 
Otherwise Print Disabled (Marrakesh, 2013), 
harmonized copyright exceptions for people 
with visual impairments. In addition, the 
EU recently enacted an extensive new direc- 
tive including a requirement that national 
copyright laws permit at least some TDM 
research (EU, 2019/790). 


Domestic law reform 

Domestic legislatures can independently 
amend their laws to permit TDM and 
other research uses without any action at 
the international level. Such legal adjust- 
ments are not unprecedented—copyright 
exceptions to permit TDM have recently 
appeared in fair use case law in the US 
(12) and in legislative changes in the EU, 
Singapore, Japan, Switzerland, and the UK 
(8). It is important for a TDM exception to 
apply to uses of all kinds of works (includ- 
ing audiovisual works used in media moni- 
toring, for example) and enable sharing of 
materials at least for the purpose of col- 
laboration and validation. Some scholars 
propose clarifying that all “nonexpressive” 
(not shared publicly) uses of works in TDM 
and other research should be deemed to be 


outside of copyright regulation (13). Japan
recently implemented this approach into 
its law, adopting an exception from copy- 
right control “where such exploitation is 
not for enjoying or causing another person 
to enjoy the ideas or emotions expressed 
in such work,” such as “in a data analysis” 
(2018, Article 30-4). 

The extension of TDM exceptions to 
commercial uses may be controversial. On 
one hand, many commercial users might 
be capable of paying licensing fees and 
other transaction costs, and copyright ex- 
ceptions that simply transfer wealth from 
copyright owners to commercial TDM users
might seem arbitrary and unjustified.
On the other hand, many socially benefi- 
cial uses of TDM—including the BlueDot 
program that originally tracked COVID-19 
(1) or internet search engines that copy and 
mine the entire internet (3)—would likely 
not exist if commercial uses were excluded 
from copyright exceptions. Some countries 
see commercial TDM as a way to invest in 
domestic innovation and technology trans- 
fer. The EU recently adopted a rule that, al- 
though not fully tested, permits copyright 
holders to opt out of commercial (but not 
“scientific” or “cultural”) TDM uses. 


Research exceptions in copyright laws around the world 


The map shows how 
“open” research 
exceptions are around 
the world to all research uses 
of all works by all kinds of 
researchers [based on (8)]. 
This includes general exceptions for 
research, scholarship, or personal uses 
and specific exceptions for text and 
data mining (TDM) research. Only the 
laws of countries labeled green are 
fully open, thus allowing interpretation 
that these permit all academic TDM 
projects (some countries labeled red 
restrict commercial uses). 


Open research exception: permits all research uses (including sharing) of all works by all users

Restrictions on sharing: permit reproductions of whole works for research but do not extend to communications with others for purposes such as collaboration or validation

Restrictions to institutional users: provide a research exception that applies only to institutions, such as nonprofit libraries

Restrictions to private reproduction: provide a research exception only for use by individuals for a “private” or “personal” use

Restrictions on types of works: permit research uses of whole works only of certain kinds, often prohibiting uses of whole books and databases

TDM restricted: confine research exceptions to short excerpts, effectively prohibiting TDM uses

Not mapped


952 2 DECEMBER 2022 • VOL 378 ISSUE 6623
Another recent TDM exception, enacted
by Singapore in 2021, offers a model link- 
ing generality and specificity. In addition 
to a broad fair use exception similar to US 
law (2021, Part 5, Division 2), Singapore 
enacted a specific TDM exception for 
“computational data analysis,” including 
the ability to share reproduced works with 
others “for the purpose of (i) verifying the 
results of the computational data analysis” 
and for “(ii) collaborative research or study 
relating to the purpose of the computa- 
tional data analysis” (2021: Part 5, Division 
8). Unfortunately, some TDM exceptions in 
other countries fall short by failing to au- 
thorize the sharing of works in collabora- 
tive research or otherwise restricting the 
full scope of TDM methods (see the table). 


Policy guidance 

Guidance in the interpretation and amend- 
ment of copyright law could help policy- 
makers evaluate their options. Even in 
countries with permissive legislation, 
there is likely to be value in clarifying the 
application of national law to TDM re- 
search. Such guidance can be provided, for 
example, through statements of best prac- 
tices developed by the research community 


in collaboration with legal experts. State- 
ments of best practices in fair use have 
been successful in enabling filmmakers, 
educators, research librarians, and other 
user-creators to confidently use copyright 
materials in their work (14).


LIBERATE AND REGULATE 
The resistance to TDM exceptions in 
copyright comes primarily from the mul- 
tinational publishing industry, which is 
a strong voice in copyright debates and 
tends to oppose expansions to copyright 
exceptions. But the success at adopting ex- 
ceptions for TDM research in the US and 
EU already—where publishing lobbies are 
strongest—shows that policy reform in this 
area is possible. Publishers need not be 
unduly disadvantaged by TDM exceptions 
because publishers can still license access 
to their databases, which researchers must 
obtain in the first instance, and can offer 
products that make TDM and other forms 
of research more efficient and effective. 
Policy-makers and the public may fear that expanding TDM rights will
empower technology companies and models of surveillance capitalism that
are under increasing scrutiny by regulators. But copyright permission
does not trump privacy, consumer protection, or other regulation of the
activity of technology conglomerates. Countries can liberate TDM research
and still regulate these other areas.


TDM exceptions in copyright laws

The table shows the small number of countries that had enacted specific copyright exceptions for
text and data mining (TDM) research as of July 2021 [based on (8)]. It applies the same color scheme as
that in the figure, thus showing that only a small number of countries with TDM exceptions at the time had
legislated to allow all academic TDM research. Some countries (such as the UK, Switzerland, and various EU
countries) are different colors in the figure than in this table because the map in the figure identifies the
most open research exception, including any general exception for a research use, and this table only analyzes
specific TDM exceptions. The UK, Switzerland, and some EU countries have more open general research
exceptions than they provide specifically for TDM research (for example, their general research exceptions
apply to all uses, not only reproduction). Whether one can apply the more general exception above and
beyond the uses allowed by the TDM exception is a local legal question that this study did not attempt to
answer. EU DSM Art 3, Article 3 of the EU Directive on Copyright in the Digital Single Market.

COUNTRY | COMMERCIAL | USES PERMITTED | USERS | WORKS | TYPOLOGY
Japan | | Use | | | Open to TDM
Singapore | | Reproduction, communication | | | Open to TDM
Germany | | Reproduction, communication, storage | | | Open to TDM (noncommercial)
Estonia | | “processing” | | All | Open to TDM (noncommercial)
UK | | Reproduction | | All | Reproduction only
Switzerland | | Reproduction | | All | Reproduction only
EU DSM Art 3 | | Reproduction, storage | Cultural institutions | All | Cultural institutions only
France | | Reproduction, communication (decree) | | Scientific writings | Limitation on works
Ecuador | | Safe harbor for liability of libraries for TDM “acts carried out by their users” | Libraries, archives (safe harbor) | All | Lacks TDM right; safe harbor only

Failing to authorize TDM research ev- 
erywhere aggravates harmful disparities 
in our global research system. As shown in 
the figure, the most open regimes for TDM 
research are concentrated in some of the 
wealthiest countries and regions, whereas 
many poorer countries have the most re- 
strictive copyright laws. To ensure that the 
needs of TDM researchers are heard in the 
local and national forums where copyright 
laws can be modified to enable this re- 
search, researchers themselves must speak 
out to voice their concerns and needs. It is 
time that copyright laws around the world 
are adapted to enable TDM research.


SPACE EXPLORATION


Religion in space


A sense of divine entitlement pervades private space colonization efforts


By Roger D. Launius 


Human space exploration has always been about a quest for utopia, laced
with a fair measure of religious conceptions. In the early 1970s, Chris
Kraft, the godfather of NASA’s Mission Control Center and a leader in
the Apollo Moon landing program, characterized his support of space
exploration in this way: “This step into the universe is a religion and
I’m a member of it.”

Kraft’s statement made clear that those interested in moving beyond
Earth, like the migrants to the Americas of the 16th and 17th centuries,
endeavored to create a more perfect human experience free from the
strictures of known society. Of course, what constitutes “a more perfect
human experience” depends very much on individual perspective. Many
observers of the space exploration community recognize this reality and
have provided stinging critiques about what this might mean for
extraplanetary regions explored by humans, who will bring with them all
their beliefs and practices, for better and worse.

Mary-Jane Rubenstein, a professor of religion and science in society at
Wesleyan University, adds her voice to these critiques with Astrotopia:
The Dangerous Religion of the Corporate Space Race. She notes how human
spaceflight supporters have long insisted that space is the next step in
humanity’s “natural” and therefore irrepressible need to explore, often
framing this inclination as a spiritual quest, a purification of
humanity, and a search for absolution and immortality. These deep-seated
convictions, she observes, have energized space exploration from the
dawn of the space age.

Astrotopia
Mary-Jane Rubenstein
University of Chicago

Many of Rubenstein’s historical examples are well known. Captain Kirk’s
soliloquy, “Space, the final frontier,” at the beginning of Star Trek
and John F. Kennedy’s 1962 speech about setting sail on “this new sea”
invoked journeying to a different land, settling an uncolonized region,
and creating a new civilization. Such conceptions conjured images of
self-reliant people moving to untouched territories in sweeping waves of
discovery, exploration, and settlement. Implied therein were utopian
ideals of optimism, individuality, and democracy. But flattening space
into a mythological frontier reduced the complexity of events that would
transpire during such exploration to a static morality play, avoided
matters that challenged or contradicted the myth, framed the settlement
experience as inherently good, and ignored the cultural context of
migration.

A rendering of a proposed space habitat offers an idealized view of
space settlement untroubled by problematic aspects of migration and
colonization.

Aspiring space colonizers disappointed 
with NASA’s declension in the 1970s began 
to imagine an alternative agenda aimed at 
achieving a bountiful future on a pristine 
planet, increasingly without government in- 
volvement. “New Space” advocates may be 
thought of as orphans of Apollo who found 
their way into myriad economic, political, 
and social camps. They evince distrust of 
authority, especially governmental author- 
ity, and celebrate the entrepreneurial spirit 
of Elon Musk, Jeff Bezos, and Sir Richard 
Branson, who they believe will finally
open a boundless space frontier. Such in- 
dividuals may support NASA's efforts when 
they converge with their own interests, but 
they have grown increasingly critical of the 
space agency and any other large govern- 
mental activities. 

Along with criticisms of NASA, “New 
Space” advocates also accept a dystopian 
future on Earth. They argue that in the 21st 
century, exponential growth of population 
and diminishing resources will create cata- 
clysm. The answer, they believe, is to escape. 
And although humanity does not yet possess 
the technological capability to send human 
colonies elsewhere in the Solar System, these 
obstacles, they maintain, can be overcome. 

Such beliefs are why, as Rubenstein 
makes clear, Musk and Bezos have become 
messiahs for the “New Space” community. 
In building the rockets necessary to get off 
this planet, presumably without government 
sponsorship, these entrepreneurs are open- 
ing the regions beyond Earth to settlement 
as never before. All will be the better for it, 
they believe. Notwithstanding the corporate 
ethos of Musk and Bezos, their supporters 
view their efforts as immensely more accept- 
able than the efforts of NASA. 

Ultimately, Rubenstein succeeds in high- 
lighting both the debate over whether future 
space exploration and exploitation should 
be led by government or entrepreneurial 
entities and the manner in which neoliberal, 
private-sector emphases have come to domi- 
nate the thinking of a particular segment 
of the pro-space community. Her criticisms 
of this phenomenon—part of a growing 
body of literature in environmental studies, 
Afrofuturism, and anticolonialism investiga- 
tions—are on point. 


10.1126/science.adf0791


SCIENCE LIVES


Meet the Huxleys


A historian traces the scientific family’s complicated lives and influential legacies


By Piers J. Hale 


What is our evolutionary inheritance?
What has our natural history
made us, and how does this
affect how we might live? Fur- 
ther, given our understanding of 
the mechanisms of inheritance, 
can we and should we direct our own evo- 
lution to ensure the betterment of human- 
kind, both in body and in mind? Questions 
such as these are the intellectual offspring 
of the evolutionary view of life that 
Charles Darwin described in 1859. 
They also motivated the Huxleys, 
one of science’s most famed families
and the subject of Alison Bashford’s
new book, The Huxleys.

Bashford’s book centers on
Thomas Henry Huxley (1825-1895),
Darwin’s great champion,
and his grandson Julian Sorell
Huxley (1887-1975), author of,
among much else, Evolution: The
Modern Synthesis, an influential
text that popularized evolutionary
biology in the mid-20th century
(1). The Huxleys spans the publication
and popularization of On the
Origin of Species, the subsequent 
eclipse of natural selection as the 
primary factor in the forging of 
new species in favor of Mendel’s 
rediscovered mutationist theory of 
heredity, and the later recognition 
that selection might be the decid- 
ing factor in the sorting out and 
perpetuation of gene frequency in 
a population. 

For the Huxleys, biology was al- 
ways political, but it was also full 
of contradictions. Thomas Henry’s hierarchical
conceptions of race and sex were
still liberal enough and typical enough of 
his time that he could oppose the more ex- 
treme racism of many of his contemporaries 
in British anthropology and advance sex- 
segregated science education for women 
seemingly without internal conflict. 

Julian, meanwhile, defended eugenics
while declaiming Nazism and fascism; he
had sympathy for socialism but rejected
Lysenko’s opposition to “bourgeois” genetics.
In the 1940s, he used his position as director
of UNESCO to advance his doctrine of
“evolutionary humanism,” through which he
sought to employ eugenics and population 
control as a means to human betterment. 

In Bashford’s account, biology and politics are but the warp and weft of
what must be one of the most compelling and tragic multigenerational
scientific legacies. For while Thomas Henry, Julian, and their many
achievements are in the story’s foreground, it is the lives, loves, and
losses that wracked this family that underlie Bashford’s tale.

Julian Huxley (left) and his grandfather Thomas Henry Huxley (portrait
on right) were key voices in early evolutionary thought.

Despite how Thomas Henry embraced 
and advanced Darwin’s views on moral- 
ity—which shocked critics by discard- 
ing the notion of divinely ordained prin- 
ciples of right and wrong—he strove 
to place himself and his family beyond 
reproach, adhering to the most conven- 
tional moral standards of his day. His 
son Leonard’s children, however, broke
rather than pushed societal boundaries.
Julian and his wife experimented with
an open relationship; he had a stream of
lovers, at least one of whom she shared,
and wrote popular articles on birth control,
sex, and the future of humanity for
Playboy. Aldous, for his part, married “a
Belgian bisexual beauty,” experimented
with LSD, and sought to test the nature
and limits of the mind. He wrote Brave
New World in 1932 as Europe tumbled toward
authoritarianism, an inclination he
recognized in the enthusiasm his brother
Julian shared with H. G. Wells for
technocratic state solutions to society’s
problems.

The Huxleys: An Intimate History of Evolution
Alison Bashford
University of Chicago Press, 2022. 576 pp.

Along with his surname, many of the Huxleys inherited the family
patriarch’s tendency toward deep depression. Thomas Henry wrestled with
this darkness for years, and it almost overthrew him on more than one
occasion: once at the loss of his firstborn, Noel, who succumbed to
scarlet fever at 4 years old, and again at the death of his daughter
Mady, who, struggling with her own mental demons, died suddenly of
pneumonia at the age of 27.

Julian, we learn, was in and out of institutions just as his career was
taking off. Unable to function, he was forbidden by his doctors even to
write, just as he was supposed to be taking up a professorship at Rice
University. Noel Trevenen, Julian’s younger brother, was similarly
troubled. Missing for days on the eve of the Great War, he was later
found hanging from a tree in a secluded forest.

Bashford tells the story of these 
intertwined lives with sympathy and can- 
dor but also with dexterity, as she weaves 
together themes not beholden to a linear 
chronology. Readers follow the Huxleys 
as they contemplate nonhuman animals, 
primates, man, and mind in their intergen- 
erational quest to understand the impli- 
cations of evolution on what it means, or 
might mean, to be human.


Guizhou snub-nosed
monkey in peril


The Guizhou (or gray) snub-nosed monkey
(Rhinopithecus brelichi) is endemic
to Guizhou province in southwestern
China. Because of the species’ restricted
distribution and small population (1), the
Fanjingshan National Nature Reserve was 
established to protect it in 1978 (2), and it 
was categorized as China’s National Class I 
key protected wildlife in 1989 (3). However, 
anthropogenic disturbances such as defor- 
estation, farming, livestock grazing, and 
tourism have continued, degrading the 
monkey’s habitat (2, 4). Without immediate 
action, the Guizhou snub-nosed monkey 
could go extinct. 

In 2008, there were about 750 Guizhou 
snub-nosed monkeys surviving in the wild 
(2). In 2009, the construction of an aerial 
tram divided the monkey’s only remaining 
habitat into two parts, and prevented the 
population from accessing the southern 
region. Confined to a small fraction of the 
Fanjingshan National Nature Reserve, the 
species’ estimated population decreased 
to between 125 and 336 individuals (4, 5), 
with low genetic diversity (6). In response, 
the International Union for Conservation 
of Nature categorized the monkey as 
Critically Endangered (5) and one of the 
world’s most endangered primates (7). 

To protect the Guizhou snub-nosed 
monkey from extinction, tourism develop- 
ment in the Fanjingshan reserve should be 
halted immediately, and tourist numbers 
should be strictly limited. An ecological 
corridor should be created by dismantling 
the aerial tram and replanting evergreen
and deciduous broadleaf mixed forest trees
between the two habitats. Finally, China’s 
government should attempt to translocate 
some individuals to other suitable places 
to establish new populations outside of the 
present reserve.


Combatting national
research restrictions 


In her Science Insider piece “Indonesia
bans five foreign scientists, shelves
conservation data” (7 October, https://scim.ag/up),
D. Rochmyaningsih describes how
the Indonesian government is suppressing 
conservation scientists, research, and data 
in pursuit of economic development. Other 
governments have also restricted domestic 
and international research on politically 
sensitive topics, sometimes indirectly by 
controlling data (1), restricting funding
(2), denying access to research sites (3) or 
specimens (4, 5), or labelling researchers 


Tourism development has fragmented the habitat of the Guizhou
snub-nosed monkey.


as disruptive protesters, foreign spies, or 
provocateurs (6-8). To address global envi- 
ronmental challenges, researchers need 
freedom to collect, analyze, and share data 
without political constraints. International 
treaties and organizations could counteract 
some of these restrictions. 

At its 15th Conference of the Parties in 
December, the Convention on Biological 
Diversity’s (CBD) 196 signatory nations 
(9) will adopt a protocol formalizing its 
new agreements. To address restrictions 
imposed on researchers, the CBD could 
include a requirement for all signatory 
nations to allow unimpeded access to their 
territories by biodiversity researchers from 
all nations, including their own. Countries 
could enforce compliance through sanc- 
tions against exports linked to biodiversity 
loss, including food, forest, and mineral 
products. Trade sanctions are limited by 
the World Trade Organization (WTO), but a 
CBD research-freedom protocol would com- 
ply with WTO related-sector, proportional- 
ity, and environmental rules (10). 

In addition to a CBD protocol, interna- 
tional entities could withhold benefits from 
countries that deny access to researchers. In 
the case of Indonesia, the nonprofit Forest 
Stewardship Council could decertify the 
country (11) from its timber ecolabel, used
widely by retailers in the European Union 
and United Kingdom. Corporations could 
also penalize countries for restrictions on 
scientists; airlines that ban the transporta- 
tion of hunting trophies from threatened 
species (12) could serve as a model. Large
importing nations can implement sanc- 
tions as well. For example, the United States 
greatly reduced global harvest of the endan- 
gered hawksbill turtle by placing an import 
embargo on Japanese fish products (10).

International cooperation and sanctions
could substantially improve researchers’
access to sites and reduce politically
charged targeting of scientists. Together, 
CBD nations and global organizations have 
the legal and political power to demand 
reduced impediments to research on biodiversity
conservation.


Biological invasions in
China’s coastal zone 


Over the past few decades, China’s coastal 
zone has been subject to extensive land 
reclamation. As of 2014, approximately 
65% of all coastlines in China were 
affected by the expansion of farmlands, 
salt pans, aquaculture ponds, roads, and 
buildings (1), resulting in disrupted ecosystem
services, ecological security, and
sustainability (2-4). About a decade ago, 
the Chinese government began address- 
ing the problem (2, 5). On 1 June, a new 
Wetland Protection Law came into effect, 
which is expected to build on past suc- 
cess (6). However, invasive species now 
threaten China’s coastal wetlands. 
Spartina alterniflora, a perennial tall- 
grass native to North America, was inten- 
tionally introduced in China in 1979 for soil 
amelioration, tidal reclamation, and erosion
mitigation (7). The species expanded
rapidly from 0.001 km² in 1981 to about
550 km² in 2015 (7, 8). S. alterniflora has
invaded several protected areas of coastal 
wetlands, including a world heritage site 
of migratory bird sanctuaries along the 
Yellow Sea coast and Bohai Gulf of China 
(9). The S. alterniflora saltmarshes prevent 
shorebirds from finding prey on mudflats 
and reduce coastal biodiversity (7, 8). 
Given the projected global climate 
change, S. alterniflora is likely to continue 
to spread unabated in China. Warming 
could enhance its growth and reproduction 
and facilitate its spread to high latitudes. 
Sea level rise could increase its competitive 
advantage over native plants (7, 10). 
Although traditional coastal reclama- 
tion for development has been controlled, 
the invasion and expansion of S. alterni- 
flora pose a serious threat to the quality 
and sustainability of China’s coastal wet- 
lands. The control of S. alterniflora and 
the restoration of native saltmarshes are 
desperately needed to maintain the habi- 
tat quality and sustainability of China’s 
coastal zone (11). Given the high dispersal
ability and rapid reinvasion potential of S. 
alterniflora, national coordinated actions 
and funding mechanisms are urgently 
needed to effectively control S. alterni- 
flora invasions at multiple temporal and 
spatial scales.


AIR POLLUTION

Pollution harms children’s development

The harmful effects of neighborhood poverty on children's cognitive
development can be partly explained by early exposure to toxic air
pollution. Wodtke et al. analyzed data from a national sample of
American infants matched with information about their exposure to dozens
of potentially harmful pollutants. Using causal inference and
machine-learning methods, they found that infants who lived in
high-poverty neighborhoods were exposed to many different pollutants and
that this exposure was linked to impaired cognitive ability when the
children were tested at age 4. These findings demonstrate how
neighborhood poverty consists not only of economic deprivation but also
of environmental health hazards that contribute to the reproduction of
poverty across generations. —JEB
Sci. Adv. 10.1126/sciadv.add0285 (2022).

Air pollution in Chicago, pictured here, is not evenly distributed and
has disproportionate effects on children in poor communities.


MACHINE LEARNING
Machine learning to play
Stratego

Stratego is a popular two-player imperfect information board game.
Because of its complexity stemming from its enormous game tree,
decision-making under imperfect information, and a piece deployment
phase at the start, Stratego poses a challenge for artificial
intelligence (AI). Previous computer programs only performed at an
amateur level at best. Perolat et al. introduce a model-free multiagent
reinforcement learning methodology and show that it can achieve human
expert-level performance in Stratego. The present work not only adds to
the growing list of games that AI systems can play as well as or even
better than humans but may also facilitate further applications of
reinforcement learning methods in real-world, large-scale multiagent
problems that are characterized by imperfect information and thus are
currently unsolvable. —YS
Science, add4679, this issue p. 990

PLANT SCIENCE 
Nitrogen fixation kept 
in balance 


Legumes such as soybeans can 
assimilate atmospheric nitrogen 
with the assistance of symbiotic 
bacteria that reside in root nod- 
ules. Nitrogen fixation is energy 
intensive, and the cellular energy
status in the nodule determines
the rate of nitrogen fixation. Ke 
et al. investigated that linkage 
using molecular sensors in the 
root nodule that respond to AMP 
concentrations. The sensors 
regulate access of a nuclear 
regulatory factor to certain 
glycolytic genes, thus shifting 
allocation of phosphoenol- 
pyruvate between competing 
pathways to drive up nitrogen 
fixation when the nodule has 
enough energy. —PJH 

Science, abq8591, this issue p. 971


ULTRACOLD CHEMISTRY
Forming dense, ultracold 
triatomic gases 


After the successful creation of 
ultracold diatomic molecular 
gases, the next grand experi- 
mental challenge is to prepare 
and control ultracold gases of 
triatomic molecules. Because 
they have more degrees of 
freedom, these gases offer 
many exciting research oppor- 
tunities for molecular precision 
spectroscopy and quantum 
simulations. Using adiabatic 
magneto-association by ramp- 
ing the magnetic field through 
a Feshbach resonance between 
²³Na⁴⁰K and ⁴⁰K, Yang et al. created
an ultracold gas of weakly
bound ²³Na⁴⁰K₂ triatomic molecules
at about 100 nanokelvin,
as confirmed by their direct 
detection using the radiofre- 
quency dissociation. About 
4000 triatomic molecules were 
created with a high peak density 
10 orders of magnitude higher 
than the best results previ- 
ously reported, representing an 
important milestone in ultracold 
chemistry and physics. —YS 
Science, ade6307, this issue p. 1009 


MONKEYPOX 
Monkeypox antiviral drug 


The antiviral drug tecovirimat 
(TPOXX) was developed to treat 
smallpox, which is caused by 
infection with variola virus, an 
orthopoxvirus closely related to 
monkeypox virus (MPXV). During 
the current global MPXV outbreak,
TPOXX has been approved
for treating infected individuals 

in Europe and has been used in 
other countries where available. 
Warner et al. now demonstrate 
that TPOXX is effective against 

a 2022 Canadian MPXV isolate 
both in vitro and in the CAST/EiJ 
mouse model. This viral isolate 

is not as lethal in CAST/EiJ mice 
as previous isolates, despite 
evidence of respiratory tract 
replication, and it appears to align 
with other recent isolates that 
define a new viral clade. Together, 
these findings support the 
therapeutic use of TPOXX against 
currently circulating MPXV and 
suggest changes in the virus that
alter pathogenicity in mouse
models. —CNF 
Sci. Transl. Med. 14, eade7646 (2022). 


QUANTUM SIMULATION 
Topological quantum
photonics
Exploiting the topological prop- 
erties of materials is expected 
to provide a route to developing 
robust platforms for transport 
and communication systems 
that are immune to defects. In 
optics, the demonstration of 
topological behavior has been 
confined mainly to classical light. 
Deng et al. introduce a superconducting
chip–based platform
consisting of a single qubit cou- 
pled to a number of resonators. 
By controlling the photon popu- 
lation in each resonator and the 
coupling strength, the authors 
were able to realize several 
important models in topological 
physics. The approach bridges 
the gap between topological 
states of classical and quantum 
origin. —ISO 

Science, ade6219, this issue p. 966 


QUALITY CONTROL 
Weeding out faulty
membrane proteins
Membrane proteins connect the 
cell to its environment and are 
essential for key biological activi- 
ties and cell survival. To function 
properly, membrane proteins 
need to adopt a well-defined 
three-dimensional structure 
within the lipid bilayer. Failures in 
this process give rise to numer- 
ous diseases. Zanotti et al. found 
that the human signal peptidase 
complex, which removes signal 
peptides from endoplasmic 
reticulum–targeted secretory
and membrane proteins, has an 
additional quality control func- 
tion. The complex cleaves faulty 
membrane proteins, supporting 
their degradation and helping 
to maintain a healthy mem- 
brane proteome. These findings 
extend our understanding of 
molecular quality control in cells 
and suggest potential targets in 
protein-folding disorders. —SMH 
Science, abo5672, this issue p. 996


DEVELOPMENT

Self-organizing feather patterns

Self-organizing biological systems can produce an array of patterns
ranging from meanders and waves to stripes or spots. However, small
variations in starting conditions or signaling parameters can produce
wildly different patterns. Curantz et al. investigated how bird plumage
patterns are stabilized in the developing skin. In the developing dermal
cells of zebra finch, penguin, and quail, feather follicle patterns
emerged early and remained stable, in contrast to emu and ostrich, in
which follicle patterns were loosely organized. It appears that a
combination of cell shape and skin flexibility influences follicle
movement during development and thus how defined or irregular the
resulting feather patterns become. —PJH
PLOS Biol. 20, e3001807 (2022).


MICROBIOTA 
Metabolic
reconstructionism

A major function of the vertebrate gut microbiota is the fermentation of
foodstuffs to extract energy for each partner’s benefit. Earlier
literature informed Hoces et al. that mice lacking a microbiota tended
to be less fat than their intact counterparts. The authors built an
isolator-housed metabolic cage system to monitor the metabolism of
conventional mice, germ-free mice, and mice hosting a model microbiota
of 12 bacterial lineages called OligoMM12. All groups extracted about
9 kilocalories per day from their food, but the OligoMM12 mice became
fatter. The germ-free mice ate more and excreted more than the other
groups, but their feces were less energy dense because they lacked
bacterial mass. The OligoMM12 community did not recapitulate the
activity of an intact microbiota, but if microbiota species are
evolutionarily selected for their contribution to host energy, OligoMM12
could be modified to test this idea. —CA
PLOS Biol. 10.1372/journal.pbio.3001743 (2022).


DEVELOPMENT 
Two faces of 
differentiation 


Proper cell differentiation not only requires the correct signals but
also needs to avoid confusion with signals from neighboring cells. For
instance, anterior intestine cells and liver cells arise from similar
nearby progenitor cells, so how do intestinal cell precursors prevent
induction by signals from liver cells? Yang et al. used a genetic screen
in zebrafish to identify a mutation in the Cdx1b homeoprotein that
induces intestinal cells to become liver cells. The Cdx1b transcription
factor regulates the expression of an inhibitor of Wnt signaling called
secreted frizzled related protein 5, which is a cue for liver
differentiation. Therefore, intestinal precursor cells do not rely
solely on inductive signals; they are also able to resist neighboring
signals to ensure correct differentiation. —LBR
Proc. Natl. Acad. Sci. U.S.A. 119, e2205110119 (2022).


MICROBIOLOGY

Keeping TB in the lung
Bacterial DNA from Mycobacterium tuberculosis has been detected in the
skeletons of ancient Egyptian mummies. This means that although
tuberculosis (TB) is largely a lung disease, M. tuberculosis can
sometimes disseminate to bone. Studying a TB outbreak with unusually
high levels of skeletal disease, Saelens et al. found that the presence
of an ancestral version of a specialized bacterial secreted protein
called EsxM likely helped to define the clinical course of infection.
The authors also found that the ancestral version of EsxM rewired
infected host macrophages to become more migratory, which promoted the
dissemination of infection. M. tuberculosis strains from modern lineages
that are broadly distributed geographically contain an inactivating
mutation in EsxM, likely limiting the extent of extrapulmonary
dissemination and promoting transmission. —SMH


Cell 185, 1 (2022). 


An ancient lineage of the tuberculosis bacterium can still cause the
type of skeletal disease seen in this mummy, which dates from 21st
dynasty Egypt.


Australian zebra finches, Taeniopygia castanotis, have well-defined
plumage patterns because of their relatively inflexible skin.


SYNTHETIC CHEMISTRY


User-friendly reaction 
optimization tool 


Recent advances in solv- 

ing various challenges in 
synthetic chemistry using 
data science and machine 
learning (ML) methods have 
generated considerable inter- 
est in the development of 
multifunctional ML platforms 
that could simultaneously 
optimize complex multiple 
reaction objectives and remain 
comprehensible and useful 
to synthetic chemists with 
limited programming experi- 
ence. Garrido Torres et al. 
developed an open-source, 
multi-objective, active-learning 
platform based on Bayesian 
optimization methods. They 
demonstrated its successful 
application in a real-world test 
case: the simultaneous optimi- 
zation of the reaction yield and 
enantioselectivity for a nickel/ 
photoredox-catalyzed enan- 
tioselective cross-electrophile 
coupling of styrene oxides 
with two different aryl iodide 
substrates. The proposed 
platform is web accessible and 
could be expanded to numer- 
ous optimization problems in 
synthetic chemistry. —YS 

J. Am. Chem. Soc. 144, 19999 (2022). 


AGRICULTURE 


Crop innovation as 
climate adaptation 


Innovations in crop biotechnol- 
ogy focused on environmental 
adaptation have mitigated an 
estimated 20% of potential 
economic damages to US 
agriculture caused by climate 
change-induced increases 
in extreme heat exposure. In 
addition to the $24 billion in 
avoided historical damages 
since 1960, Moscona and 
Sastry estimate that similar 
redirection of crop innovation, 
more specifically by the private 
sector than the public, could 
mitigate 13% ($1.05 trillion) of 
damages projected by 2100. 
However, they also found that 
private sector-driven his- 
torical innovation has been 
unresponsive to agricultural 
conditions beyond the United 
States, a considerable limita- 
tion for addressing threats from 
climate change in other parts of 
the world. —BW 
Q. J. Econ. 10.1093/ 
qje/qjac039 (2022). 


METALLURGY 


Getting the carbon out 
Recycling iron alloys is 
important for sustainable 
development because of the 
high costs of refining iron ores. 
However, controlling impurities 
in the recycling process pres- 
ents a challenge for producing 
high-quality alloys. Judge et al. 
used electrorefining to produce 
ultra-low-carbon steels from 
recycled metal. The technique 
avoids using dissolved oxygen, 
instead relying on a reaction 
at the slag interface to evolve 
carbon monoxide. The strategy 
provides a more direct method 
for producing high-quality 
steels for structural applica- 
tions. —BG 

Nat. Mater. 21, 1130 (2022). 
HIV CLINICAL TRIALS 
Human vaccination 
induces bnAb precursors 


Each year, more than 1 mil- 
lion new HIV infections occur, 
highlighting the need for 
effective HIV vaccines. Vaccine 
strategies that induce broadly 
neutralizing antibodies (bnAbs) 
have promise to combat HIV 
and other pathogens but have 
not yet been tested in humans. 
Leggat et al. report the results of 
a phase 1 clinical trial showing 
that a germline-targeting prim- 
ing immunogen was safe and 
feasible and induced targeted 
bnAb-precursor responses in 
97% of vaccine recipients at 
substantial frequencies in each 
individual (see the Perspective 
by Moore). bnAb-precursor 
responses made favorable gains 
in mutation and affinity after a 
booster vaccination. The results 
establish proof of principle 
for this reductionist vaccine 
approach and encourage the 
development of additional boost- 
ers to induce bnAbs. —PNK 
Science, add6502, this issue p. 964; 
see also adf3722, p. 949 


METALLURGY 
Too cold to fracture 


Finding structural materials that 
have good fracture properties 
at very low temperatures is 
challenging but is important for 
fields such as space exploration. 
Liu et al. discovered a high- 
entropy chromium-cobalt-nickel 
alloy that has an incredibly high 
fracture toughness at 20 kelvin 
(see the Perspective by Zhang 
and Zhang). This behavior is 
caused by an unexpected phase 
transformation that, when 
combined with other microstruc- 
tures, prevents crack formation 
and propagation. The fracture 
toughness of this alloy makes it 
potentially useful for a range of 
cryogenic applications. —BG 
Science, abp8070, this issue p. 978; 
see also adf2205, p. 947 
BIODIVERSITY 
Protecting Madagascar 


Madagascar has been isolated 
from mainland Africa and Asia 
for more than 80 million years 
and has developed a distinctive 
flora and fauna, with more than 
90% of its species endemic to 
the island nation. It is also home 
to the Malagasy people, with a 
population of about 30 mil- 
lion, and was first colonized by 
humans around the first century 
BCE. The island's biodiverse 
wildlife is highly threatened, and 
much of its human population 
lives below the poverty line. In 
Reviews, Antonelli et al. and 
Ralimanana et al. characterize 
the biological history and diver- 
sity of the island and examine 
conservation status and actions 
required to protect biodiversity 
and improve living standards 
and well-being for the Malagasy 
people. —SNV 

Science, abf0869, adf1466, 
this issue p. 962, 963 


VOLCANOLOGY 
Picturing Yellowstone’s 
plumbing 


Yellowstone is an active super- 
volcano that will cause mass 
destruction when it next erupts. 
Maguire et al. use full waveform 
seismic imaging to map the loca- 
tion and amount of melt under 
the volcano (see the Perspective 
by Cooper). They find the largest 
amount of melt is roughly in 
the depth range where previ- 
ous eruptions were sourced. 
However, the amount of melt is 
much lower than required for a 
massive eruption anytime in the 
near future. Continued monitor- 
ing of the subsurface should 
provide a clear picture if the 
situation begins to dramatically 
change. —BG 

Science, ade0347, this issue p. 1001; 
see also ade8435, p. 945 
MOLECULAR BIOLOGY 
Active DNA demethylation 
in neurons 


The DNA of neurons is continu- 
ally damaged due to lifelong, 
high-level metabolic and 
transcriptional activity. Recent 
studies have also demonstrated 
extensive “programmed” DNA 
damage in differentiating 
postmitotic neurons. Wang et al. 
identified endogenous lesions as 
single-strand-break intermedi- 
ates of thymine DNA glycosylase 
(TDG)-mediated removal of 
oxidized methylcytosines during 
active DNA demethylation (see 
the Perspective by Lopez- 
Moyado and Rao). Interrupting 
active DNA demethylation using 
antineoplastic cytosine analogs 
triggered TDG-dependent 
neuronal cell death. This work 
suggests that the well-known 
neurotoxic side effects of certain 
chemotherapies, also called 
“chemobrain,” could be linked to 
DNA repair processes intrinsic 
to normal neuronal differentia- 
tion. —DJ 

Science, add9838, this issue p. 983; 
see also adf3171, p. 948 


WILDFIRES 


High and dry 


High temperatures and dry con- 
ditions have produced extreme 
fire activity in northeastern 
Siberia recently. Scholten et al. 
report that spring and summer 
weather conditions from 2019 
to 2021, driven by concurrent 
changes in the frequency of 
Arctic front jets and seasonally 
earlier snowmelt, resulted in 
unusually intense fire activity 
there and a northward shift of 
fires (see the Perspective by 
Schaepman-Strub and Kim). 
In the future, these trends 
could accelerate the degrada- 
tion of carbon-rich permafrost 
peatlands and contribute even 
more than they do now to global 
warming. —HJS 

Science, abn4419, this issue p. 1005; 
see also ade8673, p. 944 


PHYSIOLOGY 
Limiting nutrient loss 
in the kidney 


The kidney filters the blood and 
retains nutrients through endo- 
cytosis and active transport in 
cells lining the proximal tubule. 
Rinschen et al. investigated how 
this process is regulated by 
VPS34, a lipid kinase involved 
in the vesicular trafficking and 
endocytic sorting of membrane 
proteins. VPS34 deficiency in 
proximal tubule cells in mice 
decreased the surface levels 
of nutrient transporters, which 
enhanced urinary loss of lipids 
and proteins. VPS34 inhibition 
could be used to treat diseases 
in which limiting nutrient loss 
confers clinical benefit, such as 
kidney cancer. -WW 

Sci. Signal. 15, eabo7940 (2022). 


T CELLS 
Activation answers 


T cell activation requires 
changes in metabolism needed 
for the energy demands of 
rapid growth and proliferation. 
Cytokines that engage common 
gamma chain (γc) receptors on 
T cells are critical to promoting 
the metabolic changes needed 
for activation. Villarino et al. 
examined the role of STAT5 
engagement, which is a signal- 
ing pathway shared by all γc 
cytokines. STAT5 was defined 
as a master regulator of amino 
acid metabolism in CD4+ helper 
T cells through interactions 
with enhancers and promoters 
of genes encoding a wide array 
of enzymes and transporters. 
STAT5 controlled transcription of 
members of the mTOR pathway 
to license T cells for interleu- 
kin-2-mediated mTOR signaling 
and promoted MYC-driven 
metabolic changes. Together, 
these findings provide molecular 
insights downstream of inter- 
leukin-2 engagement that are 
critical to T cell activation. —CNF 
Sci. Immunol. 7, eabl9467 (2022). 
ARCHAEOLOGY 
Bronze Age trade 
networks of tin 


Bronze, which is composed 
of copper and tin, was a key 
commodity of the ancient world 
during the second millennium 
BCE and was used for weapons 
and everyday tools. The copper 
sources for bronze have long 
been known, but those of tin 
have never been identified with 
certainty. In a study of 105 tin 
ingots from the extraordinary 
shipwreck site of Uluburun off 
Turkey's southern coast, Powell 
et al. found that one-third of 
them had their origin in Tajikistan 
and Uzbekistan, more than 3000 
kilometers away. These findings 
suggest that trade at multiple 
scales, ranging from households 
to political elites, contributed 
to a vast network, challenging 
previous models of centralized 
control of commodity trade in 
this era. -MSA 


Sci. Adv. 10.1126/ 
sciadv.abq3766 (2022). 


SYNTHETIC BIOLOGY 
Tracing a history 
of cell-cell contact 


The ability to monitor the history 
of cell-cell contact could benefit 
our understanding of cellular 
interactions in many biological 
contexts, from development to 
various disease states. Zhang et 
al. modified synthetic receptor 
systems that could be targeted 
to cells of interest in mice. 
Binding of the designed ligand 
on a sender cell to a compatible 
receptor on a receiver cell per- 
manently marked the receiver 
cell through a change in expres- 
sion of a fluorescent reporter. 
For example, they could detect 
endothelial cells that moved dur- 
ing development from the heart 
to the liver. -LBR 

Science, abo5503, this issue p. 965 
REVIEW SUMMARY 


BIODIVERSITY 


Madagascar’s extraordinary biodiversity: 
Evolution, distribution, and use 


Alexandre Antonelli* et al. 


BACKGROUND: The Republic of Madagascar is 
home to a unique assemblage of taxa and a 
diverse set of ecosystems. These high levels of 
diversity have arisen over millions of years 
through complex processes of speciation and 
extinction. Understanding this extraordinary 
diversity is crucial for highlighting its global 
importance and guiding urgent conservation 
efforts. However, despite the detailed knowledge 
that exists on some taxonomic groups, there are 
large knowledge gaps that remain to be filled. 


ADVANCES: Our comprehensive analysis of 
major taxonomic groups in Madagascar sum- 
marizes information on the origin and evolu- 
tion of terrestrial and freshwater biota, current 
species richness and endemism, and the utiliza- 
tion of this biodiversity by humans. The depth 
and breadth of Madagascar’s biodiversity, 
the product of millions of years of evolution 
in relative isolation, is still being uncov- 
ered. We report a recent acceleration in the 
scientific description of species but many 
Emergence and composition of Madagascar’s extraordinary biodiversity. Madagascar’s biota is the 
result of over 160 million years of evolution, mostly in geographic isolation, combined with sporadic 
long-distance immigration events and local extinctions. (Left) We show the age of the oldest endemic Malagasy 
clade for major groups (from bottom to top: arthropods, bony fishes, reptiles, flatworms, birds, amphibians, 
flowering plants, mammals, non-flowering vascular plants, and mollusks). Humans arrived recently, some 
10,000 to 2000 years ago (top right), and have directly or indirectly caused multiple extinctions (including 
hippopotamus, elephant birds, giant tortoises, and giant lemurs) and introduced many new species (such 
as dogs, zebu, rats, African bushpigs, goats, sheep, rice). Endemism is extremely high and unevenly distributed 
across the island (the heat map depicts Malagasy palm diversity, a group characteristic of the diverse humid 
forest). Human use of biodiversity is widespread, including 1916 plant species with reported uses. The scientific 
description of Malagasy biodiversity has accelerated greatly in recent years (bottom right), yet the diversity 
and evolution of many groups remain practically unknown, and many discoveries await. 
remain relatively unknown, particularly fungi 
and most invertebrates. 


DIGITIZATION: Digitization efforts are already 
increasing the resolution of species richness 
patterns and we highlight the crucial role of 
field- and collections-based research for ad- 
vancing biodiversity knowledge in Madagascar. 
Phylogenetic diversity patterns mirror those of 
species richness and endemism in most of the 
analyzed groups. Among the new data presented, 
our update on plant numbers estimates 11,516 
described vascular plant species native to Mad- 
agascar, of which 82% are endemic, in addition to 
1215 bryophyte species, of which 28% are endemic. 
Humid forests are highlighted as centers of di- 
versity because of their role as refugia and centers 
of recent and rapid radiations, but the distinct 
endemism of other areas such as the grassland- 
woodland mosaic of the Central Highlands and 
the spiny forest of the southwest is also impor- 
tant despite lower species richness. Endemism 
in Malagasy fungi remains poorly known given 
the lack of data on the total diversity and global 
distribution of species. However, our analysis 
has shown that ~75% of the fungal species de- 
tected by environmental sequencing have not 
been reported as occurring outside of Madagascar. 
Among the 1314 species of native terrestrial 
and freshwater vertebrates, levels of endemism 
are extremely high (90% overall)—all native 
nonflying terrestrial mammals and native 
amphibians are found nowhere else on Earth; 
further, 56% of the island’s birds, 81% of 
freshwater fishes, 95% of mammals, and 
98% of reptile species are endemic. Little is 
known about endemism in insects, but data 
from the few well-studied groups on the island 
suggest that it is similarly high. The uses of 
Malagasy species are many, with much po- 
tential for the uncovering of useful traits for 
food, medicine, and climate mitigation. 


OUTLOOK: Considerable work remains to be 
done to fully characterize Madagascar’s bio- 
diversity and evolutionary history. The multi- 
tudes of known and potential uses of Malagasy 
species reported here, in conjunction with the 
inherent value of this unique and biodiverse 
region, reinforce the importance of conserving 
this unique biota in the face of major threats 
such as habitat loss and overexploitation. The 
gathering and analysis of data on Madagascar’s 
remarkable biota must continue and accelerate 
if we are to safeguard this unique and highly 
threatened subset of Earth’s biodiversity. 
Madagascar’s extraordinary biodiversity: Evolution, distribution, and use 
Madagascar’s biota is hyperdiverse and includes exceptional levels of endemicity. We review the 
current state of knowledge on Madagascar’s past and current terrestrial and freshwater biodiversity 
by compiling and presenting comprehensive data on species diversity, endemism, and rates of 
species description and human uses, in addition to presenting an updated and simplified map of 
vegetation types. We report a substantial increase of records and species new to science in recent 
years; however, the diversity and evolution of many groups remain practically unknown (e.g., fungi 
and most invertebrates). Digitization efforts are increasing the resolution of species richness 
patterns and we highlight the crucial role of field- and collections-based research for advancing 
biodiversity knowledge and identifying gaps in our understanding, particularly as species richness 
corresponds closely to collection effort. Phylogenetic diversity patterns mirror those of species 
richness and endemism in most of the analyzed groups. We highlight humid forests as centers of 
diversity and endemism because of their role as refugia and centers of recent and rapid radiations. 
However, the distinct endemism of other areas, such as the grassland-woodland mosaic of the 
Central Highlands and the spiny forest of the southwest, is also biologically important despite lower 
species richness. The documented uses of Malagasy biodiversity are manifold, with much potential 
for the uncovering of new useful traits for food, medicine, and climate mitigation. The data 
presented here showcase Madagascar as a unique “living laboratory” for our understanding of 
evolution and the complex interactions between people and nature. The gathering and analysis of 
biodiversity data must continue and accelerate if we are to fully understand and safeguard this 
unique subset of Earth’s biodiversity. 


The Republic of Madagascar, an island 
country off the east coast of Africa, is 
home to a unique assemblage of taxa 
and a diverse set of ecosystems. The high 
levels of terrestrial and freshwater diver- 
sity have arisen over millions of years through 
complex processes of speciation and extinc- 
tion. Understanding the origins, evolution, 
current distribution, and uses of this extraor- 
dinary diversity is crucial to highlighting its 
global importance and guiding urgent conser- 
vation efforts (1, 2). 


Origins of Madagascar’s biota 


Once part of the Gondwana supercontinent, 
Madagascar and India split from Africa 150 
to 160 million years ago (Ma), with India sep- 
arating 84 to 91 Ma (3). The Malagasy fossil 
record shows both regional and widespread 
Gondwanan fauna before continental breakup 
(Fig. 1A) (4) but plant remains are scarce in the 
record (5). The Cretaceous-Paleogene (K-Pg) 
mass extinction (66 Ma), when Madagascar 
had already become an island, is believed to 
have greatly reduced the ancient Malagasy 
fauna. This species turnover presented new 
opportunities for the establishment and ra- 
diation of colonizers (6, 7). Biotic history dur- 
ing this period is almost entirely inferred from 
molecular phylogenies as there is a long gap 
in the fossil record during the Cenozoic (8). 
Molecular clock estimates suggest that few ex- 
tant groups date back to potential Gondwanan 
vicariance, including some reptile, fish, and 
insect lineages (6, 9, 10) and the plant genus 
Takhtajania (11) (Fig. 1A). Most of the current 
animal, plant, and fungal diversity originated 
from ancestors with mainly African and Indo- 
Pacific origin according to phylogenies and 
biogeographic reconstructions, and reached 
Madagascar through overseas dispersal (6, 10-12) 
(Fig. 1B). The presence of oceanic surface cur- 
rents flowing from Africa to Madagascar during 
the Paleogene, which subsided in the Miocene 
(13), coincided with the arrival of multiple ver- 
tebrate lineages that subsequently diversified 
(6, 7). It has also been proposed that short- 
lived land bridges in the Mozambique channel 
during the Neogene may have aided migration 
(14), although the significance of this is debated 
(14, 15). In addition, stepping-stone islands in 
the Indian Ocean, now submerged, may have 
facilitated animal and plant dispersal from the 
Indo-Pacific region (6). 

The current peaks and plateaus of Mada- 
gascar probably formed in the past 30 to 40 
million years (My) through mantle upwelling 
and volcanism, and the past 10 My have seen 
accelerated uplift (17, 18). This suggests that 
rather than evolving on an old stable surface, 
many of the current patterns of biodiversity 
were shaped by environmental gradients and 
dispersal barriers that are relatively young, 
geologically speaking (17). 


Regional differences 


Madagascar’s diverse biota and ecosystems 
have been categorized using many different 
systems (e.g., 19, 20), but data scarcity means 
that any inferences on the extent of native 
vegetation prior to major anthropogenic in- 
fluences come with a very high level of uncer- 
tainty. We summarize the current vegetation 
types of Madagascar (dry forest, grassland- 
woodland mosaic, humid forest, mangrove, 
tapia, spiny forest, and subhumid forest) based 
on a simplified version of the Atlas of the Veg- 
etation of Madagascar (21) (Fig. 2 and table S1) 
(22). Although our resulting simplified map is 
adequate for providing an overview of Mada- 
gascar’s main vegetation types, a higher reso- 
lution map and more detailed classification is 
needed for in-depth analyses such as system- 
atic conservation planning. We suggest that any 
new mapping classification should build on ex- 
isting mapping [including the updated classi- 
fication of (23)] but follow the suggestions of the 
IUCN global ecosystem typology (24), which is 
a hierarchical classification system that at its 
top level defines ecosystems by ecological func- 
tion and at detailed levels distinguishes ecosys- 
tems by species assemblage (25). 

There is a marked longitudinal rainfall gra- 
dient created by the high eastern edge of the 
mountain range running from north to south, 
most of which exceeds 800 m above sea level. 
Humidity brought by easterly trade winds and 
summer monsoons from the Indian Ocean is 
captured by the edge and forms a cloud layer 
at ca. 900 to 1200 m. This rain-producing sys- 
tem sustains the patchy remains of a ca. 100-km- 
wide band of evergreen humid forest along the 
east coast, with extensions to certain portions 
of the north. Rainfall patterns are largely un- 
predictable throughout the country, and there 
are frequent but irregular cyclones during the 
rainy season. This unpredictability is suggested 
to have led to unique biological adaptations in 
Malagasy species, including extremes of very 
fast or slow life histories (26, 27). 

The Central Highlands have a subhumid 
climate, which is cooler and drier during the 
winter. They are dominated by a grassland- 
woodland mosaic, where grasslands are mixed 
with agricultural land, shrubland, and patches 
of woodland. There are also areas of humid 
forest and tapia—woodland dominated by the 
tree species tapia (Uapaca bojeri)—from which 
the vegetation type takes its name. Although 
grasslands increased as a result of the degrada- 
tion of woody vegetation types following hu- 
man settlement, some are derived from the 
pantropical savanna expansion that started in 
the late Miocene (28). The extent of grasslands 
at the time of human arrival, especially in the 
Central Highlands, remains debated (29). To the 
southwest, the highland mosaic transitions into 
subhumid forests and more extensive tapia. 


The highest mountains (>2500 m) are ig- 
neous in origin and support sclerophyllous 
shrublands dominated by species of the plant 
family Ericaceae in addition to open grass- 
lands around their summits. Humidity and 
rainfall decrease in the rain shadow to the west 
of the Central Highlands, with the dominant 
vegetation type transitioning to dry forest, with 
some deciduous plant species and succulent 
elements toward the western coast. Mangroves 
are mostly found along the Mozambique Chan- 
nel coast. The southwest region is the driest 
part of the island, and the rainy season, when 
present, lasts <3 months. This climate supports 
the spiny forest ecosystem, which in global terms 
is strictly a thicket but classed as forest within the 
context of Madagascar (27). This ecosystem was 
previously thought to be Madagascar’s oldest 
and was widespread across the island when it 
lay at the edge of the tropical belt before the 
mid-Oligocene. When continental drift moved 
Madagascar north and directly into the trade 
wind zone, the spiny forest ecosystem con- 
tracted (3). However, the humid forest has been 
found to contain taxa belonging to lineages 
that date back to the Paleocene, and further 
evidence from climate reconstructions suggests 
that Madagascar was moderately humid at 
the K-Pg boundary (11, 30) (Fig. 1A). 


The arrival of humans 


Human presence in Madagascar—from both 
Austronesian and African origins—dates to at 
least the start of the CE with some evidence 
pointing to the Early Holocene—8000 BCE 
onward (31, 32) (Fig. 3). Settlement in the in- 
terior and large-scale anthropogenic impacts 
likely took place after 1000 CE, with subsequent 
progressive population growth from initially 
sparse settlements from 1200 CE onward 
(33, 34). As in other parts of the world once 
human populations began to expand, their 
activities had substantial impacts on local biota. 
This process resulted in landscape transforma- 
tion from ca. 300 CE onward (35, 36) and sub- 
sequent extinction of Madagascar’s once rich 
megafauna (here defined as vertebrates >10 kg) 
through a combination of hunting and habitat 
displacement (34, 37-40). These extinctions may 
have accelerated as a result of a shift from 
hunting and foraging to herding and farm- 
ing as the predominant methods of obtaining 
food, which brought land clearance and trans- 
formation to agricultural land (41). Drought may 
have further compounded these changes (42). 

Since settling on the island, humans have 
introduced crops and livestock for agriculture 
and husbandry (43-45) (Fig. 3). Of these, rice 
and zebu cattle have had the largest impacts 
on the landscape (43, 44) as a result of their 
vital role in sustaining human populations. 
Rice is currently widely cultivated both in the 
Central Highlands (using paddy production) 
and in the humid east, where swidden agri- 
cultural methods are used (i.e., shifting culti- 
vation involving clearing forest for conversion 
to cropland, usually by burning). With the lat- 
ter practice, soils are rapidly depleted and re- 
main fertile for only a short period, meaning 
that the land is abandoned for long fallow pe- 
riods with further vegetation being cleared at 
a new location. The expansion of the Kingdom 
of Madagascar in the late 1700s, followed by 
British and French colonialism in the 1800s 
and 1900s, accelerated trade and landscape 
transformation, resulting in a substantial loss 
of native vegetation across the island (33). Cur- 
rent patterns of Madagascar’s biological diversity 
are therefore shaped both by ancient evolution 
and recent anthropogenic activities. 


Contemporary patterns of richness, 
endemism, and use 


Madagascar is one of Earth’s “hottest” biodi- 
versity hotspots (46), with high species rich- 
ness and exceptional levels of endemism across 
many taxonomic groups, combined with high 
rates of habitat degradation and fragmentation 
Fig. 1. Timing and origins of Madagascar’s biodiversity. (A) Geological and 
environmental events in relation to the age of multiple organismal groups. The 
dark yellow horizontal bars at the bottom show the timing of landscape and 
climatic events. Vertical yellow shading along the panel corresponds to longer 
geographical events. Bars and lines show crown and stem ages of 217 lineages that 
each produced at least two endemic Malagasy species, estimated from molecular 
and fossil data. Icons correspond (from top to bottom) to nonflowering vascular 
plants, flowering plants, mammals, birds, dinosaurs (for fossil data), reptiles 
(here all Sauropsida, excluding birds), amphibians, arthropods, bony fishes, 
mollusks, and flatworms. In the fossil data section, the empty bars show the 
number of unique species in the fossil record through time that were found in 
Madagascar, with filled bars showing the number of unique species endemic 
to Madagascar. PL, Pleistocene; PLI, Pliocene; MIO, Miocene; OLI, Oligocene; 
EOC, Eocene; PAL, Paleogene; lCRE, late Cretaceous; eCRE, early Cretaceous. 
(B) Geographical origins of Madagascar’s biodiversity. These treemaps show the 
proportional origins of the 217 endemic lineages in (A), estimated through 
biogeographic reconstruction, or if unavailable, the distribution of the sister group. 
Unsaturated hues represent the proportion of lineages whose origin is ambiguous. 

Fig. 2. Map of predominant vegetation types, expanded and simplified from Moat and Smith (21). 


(Fig. 4) (46, 47). Despite the global significance 
of Malagasy biodiversity, many taxonomic 
groups remain poorly known, and Madagas- 
car ranks among the top countries for the pre- 
dicted percentage of terrestrial vertebrates 
lacking scientific description (48). Most species 
are represented by only a small number of 
records in global natural history collections 
and some groups remain practically unknown, 
particularly fungi and most invertebrates. 
Estimates place the global number of fungi 
at >6.3 million species (49), and Madagascar is 
likely to hold a large proportion of this diver- 
sity. However, to date <2000 fungal species 
and species hypotheses—the latter defined by 
genetic reference sequences (50)—have been 
reported in public databases (51, 52) and check- 
lists (53, 54). 

Concerted efforts, including taxonomic re- 
search, improved digital access to natural his- 
tory collections, and application of molecular 
techniques for species identification and de- 
limitation, have resulted in a substantial in- 
crease in the number of records and species 
Fig. 3. Human arrival. Holocene events and environmental changes around the time of human arrival. Dates 
for human introductions of dogs, zebu cattle, rats, bushpigs, goats, and rice are provided as well as last dated 
records of megafauna (hippopotamus, elephant birds, giant tortoises, and giant lemurs) (22). 


new to science in recent years, even in rela- 
tively conspicuous groups such as reptiles and 
amphibians (Fig. 5). However, many species 
remain undescribed across most taxonomic 
groups (55, 56). For example, as of June 2021, 
there were 369 described native Malagasy 
amphibians (57) but the true number has been 
estimated to be well over 500 (58). The figures 
for undescribed species of arthropods could be 
orders of magnitude higher. Of the estimated 
1300 species of ants alone (59), only 781 have 
been formally described (60). 

For Malagasy grasses, concerted herbarium 
digitization efforts over just three years re- 
sulted in a 43% increase in georeferenced spe- 
cies records. This more than doubled the median 
number of records per species and improved the 
resolution of species richness patterns (28, 61). 
In better-studied groups such as lemurs, con- 
tinued advancements in our understanding of 
their distribution, ecology, and genetic diversity 
allow us to better understand their evolution- 
ary history and inform conservation strategies 
(62). Together, these efforts show the crucial 
role of field- and collections-based research 
in advancing biodiversity knowledge and 
understanding of spatial patterns of richness, 
endemism, and speciation, while providing 
opportunities to further investigate the ecological
roles of species across Madagascar’s
ecosystems. 


Extensive endemism 


Among the 1314 native species of terrestrial and 
freshwater vertebrates (4), levels of endemism 
are extremely high (90% overall); all native 
nonflying terrestrial mammals and native am- 
phibians are found nowhere else on Earth, 
and 56% of birds, 81% of freshwater fishes, 
95% of mammals, and 98% of reptile species 
are endemic (4, 63-68) (Fig. 4). Little is known 
about endemism in insects, but data from the 
few well-studied groups on the island suggest 
that it is similarly high (69, 70). Endemism among 
Madagascar’s animals is not limited to lower 
taxonomic levels: Among birds, the island con- 
tains one endemic order (Mesitornithiformes) 
and three endemic families (Brachypteraciidae, 
Philepittidae, and Bernieridae) (71). Among mammals,
higher-level endemism includes the superfamily
Lemuroidea, the families Myzopodidae
(sucker-footed bats), Eupleridae (native Carniv- 
ora), and Tenrecidae (tenrecs), and the subfamily 
Nesomyinae (nesomyine rodents) (66, 68, 72, 73). 
For amphibians, in the family Mantellidae 
(mantellid frogs) all but three species (endemic 
to the Comoro islands) (74, 75) are endemic to
Madagascar; there are also three endemic sub- 
families: Cophylinae (narrow-mouthed frogs), 
Dyscophinae (tomato frogs), and Scaphiophry- 
ninae (rain frogs) (63). 

Malagasy flora is also highly diverse and 
mostly endemic (76). It is estimated that over 
14,000 vascular plant species occur on the is- 
land (76), including 11,516 described native spe- 
cies, of which 82% are endemic (22, 77). When the 
estimated 2550 species that remain to be sci- 
entifically described are factored in, the level 
of endemism could rise to 87% (76). Among the 
island’s flowering plants (angiosperms), there are 
310 endemic genera, ca. 19% of the generic diversity
(11), and five endemic families (Asteropeiaceae,
Barbeuiaceae, Physenaceae, Sarcolaenaceae, 
and Sphaerosepalaceae). Five families dom- 
inate the flora in terms of species richness: 
Orchidaceae (orchid family, 922 spp., 84% en- 
demic), Rubiaceae (coffee family, 806 spp., 93% 
endemic), Fabaceae (pea family, 603 spp., 76% 
endemic), Poaceae (grass family, 541 spp., 50% en- 
demic, 40% after specialist taxonomic evalua- 
tion) (78), and Asteraceae (daisy family, 529 spp., 
83% endemic) (5, 76, 77, 79). These are also the 
five largest families globally but all five are dis- 
proportionately species rich in Madagascar 
relative to the land area (~0.4% of Earth’s total). 
The Malagasy bryophyte flora is less well studied 
but is also diverse: of the 1215 described bryo- 
phyte species (767 mosses, 443 liverworts, and 
5 hornworts), 28% are endemic (80). 

Endemism in Malagasy fungi is hard to as- 
sess given that so little is known about the total 
diversity of species. However, 14% of the species 
in the Global Biodiversity Information Facility 
(GBIF) and almost 75% of the fungal species 
hypotheses detected by environmental sequenc- 
ing have not been reported as occurring out- 
side of Madagascar (22). A recent molecular 
assessment of fruiting fungi and root samples 
from five forest sites in Madagascar based on 
Internal Transcribed Spacer data (12) found 
similar levels of endemism, with 65% of se- 
quences not known from outside the country 
and 10% of species potentially new to science, 
with much of the new diversity extrapolated 
from ectomycorrhizal samples. This further 
highlights the possible magnitude of unknown 
diversity among Malagasy fungi. 


Spatial patterns of Malagasy biodiversity 


Biodiversity is not evenly distributed across
Madagascar, with much of the island’s biota
occurring in humid forests in the east as well
as on the eastern flanks of the Central Highlands
and in some northern areas such as the
Tsaratanana and Marojejy Massifs (79-82)
(Fig. 4). Overall patterns of species richness
correspond closely to collection effort, and
the variation in sampling frequency across the
country therefore makes it difficult to ascertain
true patterns of diversity in many groups
(Fig. 4). Species diversity patterns in amphibians,
reptiles, and primates are closely mirrored
by corresponding phylogenetic diversity
patterns (fig. S3). An exception occurs in water
beetles, where phylogenetic diversity is negatively
correlated to species richness and endemism,
purportedly because narrow endemism
in this group is the result of recent radiations
(83). The few studies investigating the distribution
of phylogenetic diversity in plants present
varied patterns, some resembling those of vertebrate
groups, whereas others differ markedly
(84, 85).

Fig. 4. Diversity patterns. (A) Species richness and endemism of six taxonomic
groups in Madagascar. Native terrestrial and freshwater species counts and
percentages of endemic species are based on estimates using author-curated data
compiled from The New Natural History of Madagascar (126), and the Catalogue of
the Vascular Plants of Madagascar (77). Species richness maps were generated from
species distribution models based on specimen occurrence records and bioclimatic
data; non-native and marine taxa are not included (22). Numbers in parentheses
below color ramps are the number of species used to generate the species richness
maps. (B) Patterns of species richness and collection effort for the same six
taxonomic groups. Map grid cells are 25 × 25 km; cell colors correspond to species
richness and collection number per cell, based on specimen occurrence records.
Gray denotes an absence of records for that cell.

The high species richness and endemism of 
many lineages in the humid forests of eastern 
and northern Madagascar reflect the role of 
these ecosystems both as forest refugia during 
glacial maxima (82, 86, 87), and centers of re- 
cent and rapid evolutionary radiations (88-90). 
This scenario is supported by the presence in 
these areas of high but clustered phylogenetic 
diversity in reptiles, mammals, and, to a cer- 
tain extent, amphibians (fig. S3). The grassland- 
woodland mosaic vegetation of the Central 
Highlands is marked by its own distinctive ende- 
mism despite relatively low species richness 
(78, 91). Certain groups, including reptiles and
some plant families, such as Fabaceae, Euphor- 
biaceae, and Malvaceae, show additional centers 
of diversity in spiny forests that dominate the 
island’s southwest region (77, 79, 81) (Fig. 4). 
Species endemism across taxa and regions has 
arisen through multiple mechanisms, including 
allopatric speciation across mountain ranges 
(92), between isolated inselbergs (93), and in 
fragments of forests and wetlands created 
during the wet-dry cycles of the Quaternary 
(94, 95). Narrow endemism is also linked to 
adaptive radiation across the island’s steep 
environmental gradients (87, 94, 96). 


Human use of biodiversity 


Madagascar’s rich biodiversity, particularly its 
diverse flora, has provided many opportunities 
for human utilization. Although biodiversity 
is “useful” in many ways (e.g., ecosystem ser- 
vices or nature’s contributions to people, either 
material or nonmaterial), here we report “uti- 
lized species” as those having a documented 
direct use by humans. Of the 40,283 plant spe- 
cies documented as used by humans worldwide 
(97), 1916 (5%) are found in Madagascar—of 
these, 1596 are thought to be native and 597
endemic to the island (98). Hundreds of uti- 
lized species have also been introduced, such 
as the Mesoamerican vanilla orchid (Vanilla 
planifolia), brought to Madagascar from the 
island of Réunion by the French in the mid- 
1800s, following the discovery of a method to 
speed up hand pollination by Edmond Albius 
in 1841 (99). Vanilla is the second most expen- 
sive spice in the world, and Madagascar has 
become the largest producer globally (100). 
Vanilla agroforestry is currently expanding, 
especially in northeastern (Sava region) and 
eastern (Analanjirofo and Atsinanana regions) 
Madagascar, which can pose additional threats 
to biodiversity in some cases. However, it can 
also generate opportunities for conservation 
and restoration when undertaken in sustain- 
able and safe settings and accounting for local 
land use history (100-102). Beyond the widespread
cultivation of a few introduced species,
the goods and services provided by Madagascar’s
flora are especially important for subsistence
in many rural communities (103).
Documented utilized endemic plants include
310 species used for materials (e.g., woods,
fibers, resins) (104), 91 edible species, and an
additional 120 crop wild relatives that represent
genetic reservoirs for the improvement of
food crops. Among the most important edible
groups, 38 species of yams (Dioscorea spp.) are
native to Madagascar, 31 of which are endemic
(105). Most have edible tubers and are widely
consumed throughout the island, especially
when primary crops fail (105, 106). Crop wild
relatives with potential for commercial benefits
include Madagascar’s 65 species of coffee,
Coffea spp. (107-109), which could be used as
gene and trait sources for the improvement of
the two non-native but commercially grown coffee
species, robusta (C. canephora) and Arabica
(C. arabica), for example to confer greater
climate resilience (110).

Fig. 5. Rates of scientific documentation. Percentage of described Malagasy ants, amphibians, reptiles,
and vascular plants through time, based on year of basionym publication (22).

Many of Madagascar’s 204 native palm spe- 
cies (99% of which are endemic) are used by 
people and often for multiple purposes, e.g., 
construction materials, fibers, medicine, and 
food (111). Structural constraints of palms
mean that palm exploitation is often fatal to 
the trees. Consequently, palm populations are 
often denuded in otherwise intact habitats as 
a result of selective extraction, which contributes 
to palms being among the most threatened 
of the assessed plant groups in Madagascar, 
with more than 83% of species evaluated as 
threatened (112). 

At least 221 endemic plant species have 
been documented as having medicinal value 
(97, 113-115). These include several species of 
Zanthoxylum, which have antiplasmodial properties
and are used locally to treat malaria (116),
and the widely cultivated Madagascar peri- 
winkle (Catharanthus roseus), which contains 
diverse and abundant alkaloids used in the 
treatment of some cancers and other diseases 
such as diabetes, high blood pressure, and asthma
(117). Many plant species are used solely in
traditional medicine practices in Madagascar. 
Although scientific knowledge remains incomplete
on the topic, medicinal plant species
have been documented as being used for a
wide range of health conditions across many
regions and ecosystems (103, 118-120), highlighting
the effective and potential value of
Malagasy plant diversity for humanity. 

The human uses of animals are not as ex- 
tensive as those of plants, but hunting for meat, 
especially forest-dwelling species, provides 
an important source of nutrition and protein 
for some communities (121, 122) and exerts
considerable pressure on wild populations 
(123-125). Consumption of insects—particularly 
orthopterans, lepidopterans, and coleopterans— 
is also widespread. Beyond what we report, 
there are certainly additional potential uses 
of plants that have yet to be published or dis- 
covered, and additional uses of currently uti- 
lized species that have not been documented 
by scientists. The data reported here are cer- 
tainly underestimates. 

Madagascar’s rich biodiversity has diverse 
values. Among them, the multitude of known 
and potential uses reported here reinforces the
imperative to conserve the unique Malagasy 
biota in the face of major threats such as hab- 
itat loss and overexploitation (2). 


Concluding remarks 


Our synthesis shows that the depth and breadth 
of Madagascar’s remarkable biodiversity— 
the product of millions of years of evolution in 
relative isolation (Figs. 1 and 2)—is still being 
uncovered. Although the scientific community 
has accumulated a great amount of informa- 
tion on some taxonomic groups, others remain 
relatively unknown, particularly fungi and most 
invertebrates. Fundamental information on 
biodiversity and its uses is essential for guid- 
ing conservation action (2). The gathering 
and analysis of these data must therefore continue
and accelerate, through equitable practices,
if we are to safeguard the multifaceted
aspects of Madagascar’s unique biota. 
BIODIVERSITY


Madagascar’s extraordinary biodiversity: Threats and opportunities


Hélène Ralimanana* et al.


BACKGROUND: Madagascar is one of the world’s 
foremost biodiversity hotspots. Its unique 
assemblage of plants, animals, and fungi— 
the majority of which evolved on the island 
and occur nowhere else—is both diverse and 
threatened. After human arrival, the island’s 
entire megafauna became extinct, and large 
portions of the current flora and fauna may 
be on track for a similar fate. Conditions for 
the long-term survival of many Malagasy spe- 
cies are not currently met because of multiple 
anthropogenic threats. 


ADVANCES: We review the extinction risk and
threats to biodiversity in Madagascar, using
available international assessment data as well
as a machine learning analysis to predict the
extinction risks and threats to plant species
lacking assessments. Our compilation of global
International Union for Conservation of
Nature (IUCN) Red List assessments shows
that overexploitation alongside unsustainable
agricultural practices affect 62.1 and 56.8% of
vertebrate species, respectively, and each
affects nearly 90% of all plant species. Other
threats have a relatively minor effect today
but are expected to increase in coming decades.
Because only one-third (4652) of all Malagasy
plant species have been formally assessed, we
carried out a neural network analysis to predict
the putative status and threats for 5887
unassessed species and to evaluate biases in
current assessments. The percentage of plant
species currently assessed as under threat is
probably representative of actual numbers,
except in the case of the ferns and lycophytes,
where significantly more species are estimated
to be threatened. We find that Madagascar
is home to a disproportionately high number
of Evolutionarily Distinct and Globally
Endangered (EDGE) species. This further
highlights the urgency for evidence-based and
effective in situ and ex situ conservation.
Despite these alarming statistics and trends,
we find that 10.4% of Madagascar’s land area
is protected and that the network of protected
areas (PAs) covers at least part of the range of
97.1% of terrestrial and freshwater vertebrates
with known distributions (amphibians, freshwater
fishes, reptiles, birds, and mammal species
combined) and 67.7% of plant species (for
threatened species, the percentages are 97.7%
for vertebrates and 79.6% for plants). Complementary
to this, ex situ collections hold 18% of
vertebrate species and 23% of plant species.
Nonetheless, there are still many threatened
species that do not occur within PAs and are
absent from ex situ collections, including one
amphibian, three mammals, and seven reptiles,
as well as 559 plants and more yet to be assessed.
Based on our updated vegetation map, we find
that the current PA network provides good
coverage of the major habitats, particularly
mangroves, spiny forest, humid forest, and
tapia, but subhumid forest and grassland-
woodland mosaic have very low areas under
protection (5.7 and 1.8%, respectively).

[Summary figure. Panel labels: Investment in conservation must be based on evidence, effectiveness, and
future challenges. Expanded biodiversity monitoring is needed to safeguard Madagascar’s most valuable assets.
Improving the effectiveness of existing protected areas is more important than creating new ones.
Conservation and restoration should not be framed solely around the protected area network.
Conservation must address the root causes of biodiversity loss.]

Visual representation of five key opportunities for conserving and restoring Madagascar’s rapidly declining
biodiversity identified in this Review. The dashed lines point to representative vegetation types where these
recommendations could have tangible effects, but the opportunities are applicable across Madagascar.


OUTLOOK: Madagascar is among the world’s 
poorest countries, and its biodiversity is a key re- 
source for the sustainable future and well-being 
of its citizens. Current threats to Madagascar’s 
biodiversity are deeply rooted in historical and 
present social contexts, including widespread 
inequalities. We therefore propose five oppor- 
tunities for action to further conservation in a 
just and equitable way. 

First, investment in conservation and resto- 
ration must be based on evidence and effective- 
ness and be tailored to meet future challenges 
through inclusive solutions. Second, expanded 
biodiversity monitoring, including increased 
dataset production and availability, is key. Third, 
improving the effectiveness of existing PAs— 
for example through community engagement, 
training, and income opportunities—is more 
important than creating new ones. Fourth, 
conservation and restoration should not focus 
solely on the PA network but should also in- 
clude the surrounding landscapes and com- 
munities. And finally, conservation actions 
must address the root causes of biodiversity 
loss, including poverty and food insecurity. 

In the eyes of much of the world, Madagascar’s 
biodiversity is a unique global asset that needs 
saving; in the daily lives of many of the Malagasy 
people, it is a rapidly diminishing source of the 
most basic needs for subsistence. Protecting 
Madagascar’s biodiversity while promoting 
social development for its people is a matter of 
the utmost urgency.
Madagascar’s unique biota is heavily affected by human activity and is under intense threat. Here, we review
the current state of knowledge on the conservation status of Madagascar’s terrestrial and freshwater 
biodiversity by presenting data and analyses on documented and predicted species-level conservation 
statuses, the most prevalent and relevant threats, ex situ collections and programs, and the coverage and 
comprehensiveness of protected areas. The existing terrestrial protected area network in Madagascar covers 
10.4% of its land area and includes at least part of the range of the majority of described native species of 
vertebrates with known distributions (97.1% of freshwater fishes, amphibians, reptiles, birds, and mammals 
combined) and plants (67.7%). The overall figures are higher for threatened species (97.7% of threatened 
vertebrates and 79.6% of threatened plants occurring within at least one protected area). International Union 
for Conservation of Nature (IUCN) Red List assessments and Bayesian neural network analyses for plants 
identify overexploitation of biological resources and unsustainable agriculture as the most prominent threats to 
biodiversity. We highlight five opportunities for action at multiple levels to ensure that conservation and 
ecological restoration objectives, programs, and activities take account of complex underlying and interacting 
factors and produce tangible benefits for the biodiversity and people of Madagascar. 


Madagascar’s biota, the result of millions
of years of evolution in relative
isolation, is both unique and under
threat. At the same time that the
scientific description of new species is
accelerating (1), so is the overall rate of
extinction (2), and many species may be
disappearing before they are even documented.
In this Review, we aim to consolidate inform- 
ation on the conservation status of some of 
the main elements of Madagascar’s biodiver- 
sity, evaluate the many and varied threats 
faced by species assessed under the criteria 
for the International Union for Conservation 
of Nature (IUCN) Red List of Threatened Spe- 
cies, and provide some perspectives on future 
opportunities to ensure the future of this hy- 
perdiverse and unique biota. 


Ralimanana et al., Science 378, eadf1466 (2022) 


Threats to Madagascar’s biodiversity 

Madagascar’s biodiversity is in decline, with 
some groups more threatened than others 
(Fig. 1). In our Review of threatened species, 
we follow the IUCN Red List data (3) and 
threat categories (4), unless otherwise speci- 
fied. Threatened species are those listed as 
Critically Endangered (CR), Endangered (EN), 
or Vulnerable (VU). At one extreme, 22% 
(35 species) of assessed birds are threatened, 
whereas at the other end of the scale, 73% 
(66 species) of freshwater fishes and 75% 
(173 species) of magnoliid plants are threat- 
ened. Trees are particularly important in terms 
of their broad ecological functions and human 
uses, and 63% of the 3118 assessed tree species 
in Madagascar are threatened (5). Humans 
have affected the environment since their earliest
arrival on Madagascar, not only in recent
years. To avoid a shifting baseline effect, it is 
necessary to view changes in light of human 
settlement beginning hundreds or even thou- 
sands of years ago (1). For example, despite the 
relatively low proportion of bird species cur- 
rently threatened with extinction, Madagascar 
has already lost at least 14 species (7% of all 
species) that were present when humans first 
settled the island (Fig. 1). The rate of anthro- 
pogenic extinction is even higher in mammals, 
with 23 species (10%) extirpated since the first 
human settlement. Vertebrate extinctions in- 
clude the loss of lineages representing millions 
of years of evolution—e.g., the sloth-, koala-, 
and monkey-lemurs (families Palaeopropithe- 
cidae, Megaladapidae, and Archaeolemuridae) 
and two species of hippopotamus (family Hip- 
popotamidae). The extinction of four species 
of elephant birds (order Aepyornithiformes) 
represents the global loss of a functionally 
unique clade (6, 7). Extinctions, especially those 
of megafauna such as these, have broad-scale 
implications for ecosystem functioning (6-8). 

In total, 13 endemic animal species are 
listed as Extinct (EX)—defined as extinctions 
after 1500 AD—and an additional 33 are listed 
as Extinct Prehistorically (EP)—defined as 
anthropogenic extinctions before 1500 AD 
[see (9) for a full list of documented anthro- 
pogenic extinctions before 1500 AD]. A further 
nine have been categorized as Critically En- 
dangered (Possibly Extinct) [CR(PE)]. For plants, 
no species has been assessed as EX, and only 
one species (Aloe silicicola) is categorized as 
Extinct in the Wild (EW). A further 118 plant 
species are listed by IUCN as CR(PE) (111 spe- 
cies) or as Critically Endangered (Possibly Ex- 
tinct in the Wild) [CR(PEW)] (seven species). 
Of those currently listed as CR(PE), five spe- 
cies are present in ex situ living collections, 
and their statuses should therefore be updated 
to CR(PEW) (3, 10). 

Malagasy species feature prominently among 
animal groups that have been considered by 
the EDGE of Existence program (11-13), which
ranks species according to their evolutionary 
distinctiveness and the level of threat they face 
(EDGE = Evolutionarily Distinct and Globally 
Endangered). Almost one in five species of am- 
phibians (18 species), reptiles (17 species), and 
mammals (17 species) in the top 100 EDGE 
species of each group are found in Madagascar 
(13). Yet, only 1 in 20 (four species) of the top
100 EDGE species of birds are found on the 
island. 
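
To make the ranking idea concrete, the sketch below scores species with the widely used original EDGE formulation, EDGE = ln(1 + ED) + GE × ln 2, where ED is evolutionary distinctiveness in millions of years and GE is a Red List weight (LC = 0 through CR = 4). This Review does not state which formulation the rankings cited here rely on, so the formula choice is an assumption, and the species names and ED values are purely hypothetical.

```python
import math

# Red List category weights used in the original EDGE formulation (assumed here).
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}

def edge_score(ed_myr, category):
    """EDGE = ln(1 + ED) + GE * ln(2); each threat step doubles the weight of ED."""
    return math.log(1.0 + ed_myr) + GE_WEIGHT[category] * math.log(2.0)

# Hypothetical species: (evolutionary distinctiveness in Myr, Red List category).
species = {
    "species A": (20.0, "CR"),
    "species B": (40.0, "VU"),
    "species C": (20.0, "LC"),
}

# Rank from highest to lowest conservation priority.
ranked = sorted(species, key=lambda name: edge_score(*species[name]), reverse=True)
for name in ranked:
    print(f"{name}: EDGE = {edge_score(*species[name]):.2f}")
```

In this toy example a highly threatened species can outrank a more distinct but less threatened one, which is the trade-off the EDGE ranking is designed to capture.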

Given the narrow geographic range of many 
Malagasy species [such as (14)], numerous undetected
anthropogenic extinctions are likely
to have taken place (15), such as CR Aloe spe- 
cies, which may have become extinct in the 
wild since they were last recorded. This may be 
especially pronounced in groups with high lev- 
els of micro-endemism, for example, freshwater 
fishes and amphibians (16). Ascertaining extinction
events is difficult because of sampling
biases, insufficient taxonomic knowledge re- 
garding the morphological features of extant 
species, and the challenges of comparisons 
with fossil and subfossil remnants in certain 
groups, such as frogs (17).


Reliability of species conservation assessments 


Conservation assessments rely on taxonomic 
classification, and different opinions on species 
limits and numbers may influence the proportion
of threatened species [such as (18)]. This
proportion may also be biased by an overassess- 
ment of well-known and widespread taxa, or, 
alternatively, range-restricted species that are 
more likely to be threatened. To investigate in- 
dications of bias, we calculated the fraction of 
threatened species across different plant groups 
on the basis of two sets of species: taxa with 
full threat-status assessments in the Red List 
compiled by the IUCN and their partners (19) 
and those estimated with a Bayesian neural 
network approach (Fig. 1) (9, 20), which in- 
ferred the threat status for all remaining spe- 
cies. Using this method, we predicted the 
threat status of 8821 species with an estimated 
test accuracy of >65%. All taxa with a full threat- 
status assessment were included, although 
some assessments may be out of date and 
could underestimate threat levels. 

The neural network approach combined with 
current IUCN assessments revealed a similar 
fraction of species inferred to be threatened 
across most taxonomic groups (Fig. 1). Large 
deviations from the proportion of threatened 
species in the current IUCN assessments occur 
in the ferns and lycophytes and, to a lesser ex- 
tent, in the magnoliids. The neural network 
results combined with the known IUCN cat- 
egories predicted a far higher proportion of 
threatened ferns and lycophytes {146 of 306 
species; 47.7% [95% confidence interval (CI): 
38.5 to 56.7%]} than reflected in published
IUCN assessments (1 of 33 species; 3.0%),
which suggests a bias toward assessing more 
common species. In the magnoliids, the com- 
bined results predict a lower proportion of 
threatened species [211 of 294 species; 71.8% 
(95% CI: 68.0 to 75.9%)] compared with pub- 
lished IUCN assessments alone (173 of 225 spe- 
cies; 76.9%), which suggests a bias toward 
assessing rare species in that group. 
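
The comparison above is simple arithmetic on the species counts quoted in the text. The following minimal sketch recomputes the point estimates and the shift between the two data sets; the variable and function names are illustrative, and the confidence intervals reported in the text come from the Bayesian neural network analysis and are not reproduced here.

```python
# Counts quoted in the text: (threatened species, total species).
counts = {
    ("ferns and lycophytes", "IUCN only"): (1, 33),
    ("ferns and lycophytes", "IUCN + neural network"): (146, 306),
    ("magnoliids", "IUCN only"): (173, 225),
    ("magnoliids", "IUCN + neural network"): (211, 294),
}

def percent_threatened(threatened, total):
    """Share of threatened species as a percentage, rounded to one decimal."""
    return round(100 * threatened / total, 1)

shares = {key: percent_threatened(*value) for key, value in counts.items()}

def shift(group):
    """Percentage-point change when predicted species are added to IUCN data."""
    combined = shares[(group, "IUCN + neural network")]
    assessed = shares[(group, "IUCN only")]
    return round(combined - assessed, 1)

for (group, source), share in shares.items():
    print(f"{group:22s} {source:24s} {share:5.1f}%")
for group in ("ferns and lycophytes", "magnoliids"):
    print(f"{group}: shift of {shift(group):+.1f} percentage points")
```

A large positive shift, as in the ferns and lycophytes, is what points to assessments having favored common species; a small negative shift, as in the magnoliids, points the other way.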


Genetic erosion 


The reduction of genetic diversity within spe- 
cies resulting from the extirpation of subpo- 
pulations is a crucial, yet easily overlooked, 
facet of biodiversity loss that is often a pre- 
cursor to extinction. Genetic erosion has nega- 
tive effects on individual fitness, the health 
of populations, and a species’ ability to adapt 
to changing environments, reducing their 
resilience to further change and potentially 
incurring extinction debt (21, 22). In practice,
genetic factors are not directly incorporated 
into IUCN assessments, which are based on 
measures of the probability of extinction result- 
ing from population declines, restricted geo- 
graphic ranges, and small population sizes (23). 

The reduction in population sizes of wild 
plants and animals, together with their frag- 
mentation and isolation, is generally expected 
to increase inbreeding and genetic load, re- 
ducing genetic diversity and fitness over time 
(22, 24). The few studies of intraspecific diver- 
sity in Malagasy species to date reveal that 
some species have maintained high genetic 
diversity despite habitat fragmentation (25, 26), 
whereas others have relatively low diversity, 
possibly as a result of anthropogenic effects 
(25, 27-29). Results differ even within spe- 
cies, such as in the palm Beccariophoenix 
madagascariensis, in which only some pop- 
ulations show strong signals of inbreeding, 
reflected by an excess of homozygotes (30). It 
is important to note that under some circumstances,
population decline may outstrip the
speed with which genetic diversity is eroded as
a result of inbreeding. Estimates of heterozy- 
gosity may therefore not indicate the true 
genetic health and long-term prospects of pop- 
ulations when considered in isolation (31, 32). 
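
As an illustration of how an excess of homozygotes is usually quantified, the sketch below computes the standard inbreeding coefficient F = 1 - Ho/He, where Ho is the observed heterozygosity and He is the heterozygosity expected under Hardy-Weinberg proportions. The estimators actually used in the cited studies are not given in this Review, so this is an assumption, and the allele frequencies are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Hardy-Weinberg expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in allele_freqs)

def inbreeding_coefficient(observed_het, allele_freqs):
    """F = 1 - Ho/He; positive values indicate an excess of homozygotes."""
    he = expected_heterozygosity(allele_freqs)
    if he == 0:
        raise ValueError("locus is monomorphic; F is undefined")
    return 1.0 - observed_het / he

# Hypothetical locus with two alleles at equal frequency, so He = 0.5.
freqs = [0.5, 0.5]
f_outbred = inbreeding_coefficient(0.5, freqs)   # observed matches expected
f_inbred = inbreeding_coefficient(0.25, freqs)   # half the expected heterozygotes

print(f"F (outbred population) = {f_outbred:.2f}")
print(f"F (inbred population)  = {f_inbred:.2f}")
```

Because F depends only on current genotype frequencies, a population that has crashed recently can still show F near zero, which is why such estimates should not be read in isolation.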

A more powerful although less explored ap- 
proach is to use coalescence-based demographic 
modeling, which uses genome-wide data to 
estimate the longer-term trends in population 
size, providing more information than metrics 
of contemporary genetic diversity alone (25, 33). 
In Cheirogaleus dwarf lemurs, genomic anal- 
ysis suggests that four species have experienced 
population size declines in the past 50,000 years, 
with one decline (Cheirogaleus cf. medius) start- 
ing as long as 300,000 years ago—all clearly in 
prehuman times and resulting in lower gene- 
tic diversity (29). By contrast, another geno- 
mic study shows that 5 out of 10 analyzed 
plant species with varying extinction risk have 
experienced substantial population declines 
since human colonization of Madagascar (25). 
In the golden-crowned sifaka (Propithecus 
tattersalli) (26), mouse lemurs (Microcebus 
spp.) (28), Mantella frogs (34), and the Milne- 
Edwards’ sportive lemur (Lepilemur edwardsi) 
(35), demographic declines also appear to have 
taken place after the arrival of humans on the 
island (although the inherent uncertainties 
of mutation rates in the microsatellite data 
used make the timing of these declines less
certain). 

The risks of inbreeding and increased gene- 
tic load may represent substantial and likely 
underestimated longer-term threats to the sur- 
vival of Malagasy species. This is especially 
relevant considering the high level of frag- 
mentation of native habitats in some vegeta- 
tion types, such as the humid forests, and is 
worthy of further investigation. 


Fig. 1. Madagascar’s threatened and lost biodiversity. IUCN Red List
assessment categories of major groups of plants and animals from Madagascar.
Assessment categories and coloration follow the standards used by the IUCN
Red List. Category distributions for animal groups include ray-finned fishes
(Actinopterygii, freshwater species only, N = 91 species), mammals (Mammalia,
N = 231), amphibians (Amphibia, N = 296), mollusks (Mollusca, N = 67), reptiles
(Reptilia, N = 340), arthropods (Arthropoda, N = 374), and birds (Aves, N = 209).
Category distributions for plants, indicated with saturated, wider bars, include
magnoliids (N = 225), gymnosperms (N = 6), rosids (N = 1704), monocots
(N = 822), asterids (N = 1105), other eudicots (N = 81), and ferns and lycophytes
(N = 33). Thinner, unsaturated bars indicate the relative proportion of plant taxa
in each threat category for IUCN Red List assessments combined with the taxa
where the threat category was predicted in a Bayesian neural network analysis:
asterids (N = 2924), rosids (N = 2990), other eudicots (N = 312), magnoliids
(N = 294), monocots (N = 1965), and ferns and lycophytes (N = 306). The
number indicated above each bar with a plus symbol is the number of taxa for
which the threat category was predicted using the neural network analysis. IUCN
Red List assessment categories include LC and NT, together making up the
not threatened category, whereas VU, EN, CR, CR(PE), EW, EX (i.e., extinct after
1500 CE), and EP (126) (i.e., extinct before 1500 CE but with dated records
within the past 130,000 years) make up the threatened and extinct category.
Silhouettes below the bars depict taxonomic orders with EP, EX, EW, and CR(PE)
species, with the number of species in each category per order. For some plant
groups, additional orders with single CR(PE) species are indicated with a
star. Depicted orders are, from left to right and top to bottom: Perciformes,
Cyprinodontiformes, Cetartiodactyla, Carnivora, Rodentia, Primates, Afrosoricida,
Venerida, Unionoida, Squamata, Testudines, Crocodilia, Orthoptera, Spirobolida,
Araneae, Calanoida, Cyclopoida, Podicipediformes, Cuculiformes, Coraciiformes,
Charadriiformes, Gruiformes, Anseriformes, Aepyornithiformes, Accipitriformes,
Laurales, Magnoliales, Pinales, Oxalidales, Sapindales, Myrtales, Malvales,
Malpighiales, Fabales, Asparagales, Poales, Ericales, Boraginales, Gentianales,
Asterales, and Saxifragales.


Predicting future extinction: Direct drivers of loss


Identifying direct threats is part of the IUCN
Red List assessment process, and even species
that are not explicitly threatened [i.e., those
that are assessed as Least Concern (LC), Near
Threatened (NT), or Data Deficient (DD)] can
still have threats listed. Here, we discuss these 
threats and how they apply to all species. Our 
analysis of IUCN assessments indicates that 
overexploitation and agriculture are the most 
frequently listed threats to Malagasy fauna 
(excluding invertebrates) and flora (Fig. 2), 
mirroring global findings (36). Overexploita- 
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tion is unsustainable biological resource use as 
defined by the IUCN (37), including hunting 
and collecting for subsistence use or national 
and international trade. Overexploitation is 
linked in some cases to illegal harvesting;
for example, the illegal logging of rosewood
(Dalbergia spp.) for trade, which has been banned
under the Convention on International Trade
in Endangered Species of Wild Fauna and Flora
since 2013 and under Malagasy law since 2010.
We estimated that 62.1% of vertebrates and
87.1% of plants are threatened by overexploi- 
tation and that 56.8% of vertebrates and 87.8% 
of plants are threatened by agriculture. These 
two major threats, almost equal in magnitude 
(Fig. 2), have different modes of impact— 
overexploitation is more targeted and tends 
to occur over relatively restricted areas com- 
pared with the broad effects of land clear- 
ance for agriculture. 


Fig. 2. Threats to Malagasy biodiversity. (A and B) Alluvial plots showing threats,
as defined by the IUCN, and their associations with major groups of terrestrial
and freshwater vertebrates (A) (1332 species with IUCN assessments, of which
993 species have at least one listed threat) and plants (B) [9268 species with IUCN
assessments or predictions, all of which have at least one listed threat; includes
gymnosperms (six species), which could not be visualized]. Widths of the boxes
and lines reflect the number of species affected by each threat. Threats for vertebrates
are further divided into subthreats, whereas only the highest threat classification
was available for assessed plants. The estimates for plants include predictions for
unassessed species based on a Bayesian neural network analysis (9). The color
scheme is consistent across panels. The other threat class includes pollution, climate
change, transportation, and human disturbance, plus invasives and diseases for plants.
Some threat classes have been renamed for brevity and clarity, including the IUCN
category “biological resource use,” which is referred to as overexploitation here and in
the text for brevity and in line with Intergovernmental Science-Policy Platform on
Biodiversity and Ecosystem Services (IPBES) terminology (36).

Agriculture, and to a lesser extent overexploitation,
are also the primary causes of
deforestation in Madagascar. Approximately
44% of the land area covered by native forest
in 1953 was deforested by 2014 (38). The rate
of deforestation has steadily increased, reaching
99.0 kha/year between 2010 and 2014 (38)
and, according to Global Forest Watch, remains
very high at 72.9 kha/year (2014 to
2020) (39). Deforestation in Madagascar reflects
global patterns (40) and is primarily
driven by the small-scale but widespread practice
of swidden agriculture (also known as
shifting cultivation; in Madagascar referred
to as tavy for rice cultivation in humid and
subhumid areas and hatsake for cassava and
maize in dry and subarid areas). Additionally,
cash crop production, particularly maize and 
peanut, has become a major driver of deforestation
(41) alongside the production of
products for international markets, such as 
forest-derived vanilla (42). The most frequent 
threats listed for plants and vertebrates sug- 
gest that this trend of increasing deforestation 
rates will continue, with forest loss and deg- 
radation a consequence of the clearance of 
land for agriculture—potentially associated with 
small-scale fire activity (43)—and overexploi- 
tation through selective logging and highly 
targeted activities, such as the collection of 
palm hearts. Additionally, natural system mod- 
ifications (threats from actions that convert 
or degrade habitat, e.g., anthropogenic fire 
in forests or changes in water management; 
Fig. 2) add to deforestation, threaten 23.2% of 
vertebrates, and are estimated to threaten 68.9% 
of plants. Some predictions indicate that, in the
absence of an effective strategy against deforestation,
38 to 93% of the forest present in 2000
will no longer be present in 2050 (47).
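To put the projected 38 to 93% loss between 2000 and 2050 on an annual footing, the following minimal Python sketch converts each bound into an equivalent constant yearly loss rate. The constant-proportional-rate assumption and the function name `implied_annual_loss_rate` are ours for illustration only; the cited projections need not assume a constant rate.

```python
def implied_annual_loss_rate(fraction_lost: float, years: int) -> float:
    """Return the constant yearly rate r such that 1 - (1 - r)**years == fraction_lost.

    Assumption (not from the source): forest is lost at the same
    proportional rate every year over the whole period.
    """
    if not 0.0 <= fraction_lost < 1.0:
        raise ValueError("fraction_lost must be in [0, 1)")
    if years <= 0:
        raise ValueError("years must be positive")
    return 1.0 - (1.0 - fraction_lost) ** (1.0 / years)


YEARS = 2050 - 2000  # projection window given in the text

low_rate = implied_annual_loss_rate(0.38, YEARS)   # lower bound: 38% lost
high_rate = implied_annual_loss_rate(0.93, YEARS)  # upper bound: 93% lost

print(f"Implied annual loss: {low_rate:.2%} to {high_rate:.2%} of remaining forest")
```

Under this simplifying assumption, the projected range corresponds to roughly 1% to 5% of the remaining forest being lost each year.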

For vertebrates, the greatest threat after 
overexploitation and agriculture is invasive 
and problematic species and emerging infec- 
tious diseases (referred to as “invasives/diseases” 
in Fig. 2), which affect 27% of all species (360 
species; Fig. 2). This category includes non- 
native invasive species as well as problematic 
native species and diseases of any origin. 
Changes in habitat because of the spread of 
non-native plant species can have a large ef- 
fect, and one study reports that of a total of 
546 naturalized non-native plants in Madagascar, 
101 have been found to display invasive char- 
acteristics (44). Many non-native plants, such 
as the Mexican yellow pine (Pinus patula) in 
terrestrial systems (45) and the common water 
hyacinth (Pontederia crassipes) in freshwater 
systems (46), are aggressively invasive and 
transformative in seminatural habitats and 
are clearly affecting native fauna and flora. 
Even within reserves and protected areas 
(PAs), the issue can be pronounced. For ex- 
ample, three species of invasive or problematic 
plants—strawberry guava (Psidium cattleyanum),
Molucca raspberry (Rubus moluccanus), and 
wild cardamom (Aframomum angustifolium)— 
together occupy 17.6% of the Betampona Na- 
ture Reserve (47) and are also widespread in 
Ranomafana National Park and other PAs. 

Not all impacts are negative, however, and 
there is some evidence to suggest that, because 
of their potential for faster growth, some non- 
native plants are better able to combat the 
rapid fragmentation of native vegetation and 
may be beneficial for endemic vertebrates, 
providing refuge, food, and vegetation corri- 
dors, while also improving human livelihoods 
(48). The potential for such species to become 
invasive or readily burn must, however, be 
fully considered before embarking on any
planting initiatives (49). In addition, effects 
must be considered at different scales. For 
example, the presence of strawberry guava
has been reported to locally increase species
richness in frugivores, but because frugivores are
primary dispersers of its seed, this further
contributes to the spread of the plant and to associated
changes in floral and faunal community structure
and reductions in taxonomic richness (50).

Non-native vertebrates have also had marked 
and diverse effects, which we illustrate here 
with some examples. Introduced rats (Rattus 
rattus; present since at least the 14th century) 
are now ubiquitous, even in remote areas, and 
there is evidence that their presence is asso- 
ciated with declines in native small mammals 
(51). In freshwater habitats, competition and
predation by exotic fish species are considered
major factors in the decline of native freshwater
fish (52), which have been completely
replaced by non-native species across much 
of the Central Highlands and western areas 
(53). Although not yet listed in current assess- 
ments, the recent invasion of the toxic Asian 
common toad (Duttaphrynus melanostictus), 
along with the predicted vulnerability of most 
native vertebrates to its toxins (54), is expected 
to represent a new threat to many nocturnal 
carnivores. The effects of other introduced and 
naturalized animals on native biodiversity are 
not well studied; this includes widely occurring 
species, such as dogs (Canis familiaris), cats 
(Felis catus), the common myna (Acridotheres 
tristis), and the marbled crayfish (Procambarus 
virginalis). The threat of emerging infectious 
diseases is primarily driven by the occur- 
rence of the chytrid fungus Batrachochytrium 
dendrobatidis, widely documented across Mad- 
agascar over the past decade and a potential 
threat to all amphibians, although no mass 
mortalities associated with chytridiomycosis 
have been reported in the country (55). Spe- 
cies often face multiple threats at the same 
time, although the effect of each threat can 
vary between species (Fig. 2). 

Among vertebrates, amphibians have the 
highest number of IUCN-identified threats per 
species (Fig. 2A), with a mean of 4.8 threats 
per species, followed by mammals (mean of 
2.5 threats per species) and reptiles (mean of 
2.2 threats per species). For plants (Fig. 2B), 
magnoliids have the most threats per species 
(mean of 2.9 threats per species) followed by 
rosids (mean of 2.8 threats per species) and 
other eudicots (mean of 2.8 threats per spe- 
cies). Although there might be some variation 
in the perception and documentation of threats 
between the specialists carrying out assess- 
ments, all follow the same protocols (4). 

The number and relative impact of these 
threats may change in coming decades. The 
effect of climate change on Malagasy biodiversity
remains understudied, and it is not currently
indicated in IUCN assessments as a
major threat. However, this impact is expected 
to increase in the future (56-59) and could 
potentially result in synergistic negative effects 
with unsustainable agriculture associated with 
land clearance, invasive alien species, and in- 
appropriate management of fire regimes that 
can increase future fire risk (43, 56, 57, 60). Ex- 
tinctions in one group could also have effects 
on others that depend on them, such as in 
cases of strong plant-animal mutualisms (61, 62).
Although coextinction is hard to quantify, with 
substantial knowledge and data gaps (63), 
models suggest that the effects of extinction 
can be amplified as a result of the interactions 
between species within and between trophic 
levels, with the potential to lead to secondary 
and even cascading extinctions (64, 65). 


Conservation efforts and effectiveness 
Protected areas 


PAs are the central political and scientific ac- 
complishment of Madagascar’s conservation 
strategy. The network has been continuously 
developed since the first PA was established 
in 1927 (66-70). Our data compilation shows 
that the network now encompasses 10.4% of 
the land area of Madagascar, having grown 
by more than a third over the past two dec- 
ades (Fig. 3). This recent and extensive de- 
signation of new PAs was carried out through 
a multistakeholder consultative process, in 
combination with data and literature analyses, 
through the Durban Vision initiative con- 
ceived in 2003. In addition to preserving di- 
verse ecosystems and landscapes, the focus 
has been on species groups for which suf- 
ficient diversity and distribution data were 
available, primarily vertebrates (including birds, 
mammals, amphibians, and reptiles) and some 
plant groups. Despite the production of con- 
siderable data since the Durban Vision began 
[e.g., many newly described species (7)], the 
network designed during that process remains 
highly taxonomically comprehensive. From a 
global perspective, the PA network also excels 
at capturing the vast majority of Madagascar’s 
many EDGE species: 14 of 18 amphibians, 
15 of 17 reptiles, 16 of 17 mammals, and all 
four birds (73). 

As of November 2020, there were 110 ter- 
restrial PAs with permanent protected status 
in Madagascar, covering 61,300 km² across the
country (Fig. 3) (69, 71, 72). Eleven of these are 
orphan PAs—sites abandoned by their former 
managers, with responsibility reverting to the 
Ministry of Environment and Sustainable 
Development (69). An additional 89 sites 
(15,200 km²), predominantly made up of Key
Biodiversity Areas (KBAs), are not under for- 
mal protection (69, 71, 73, 74). 

Fig. 3. Madagascar’s terrestrial PAs in the context of human population
density and changes in coverage of vegetation type over time. (A) PAs
with IUCN protected status (127), orphan status, or no formal protection
status (e.g., unprotected KBAs) shown in the context of nearby marine
PAs, surrounding bathymetry (128), coral reefs (129), cities, roads, and
[...] not currently under formal protection, mostly KBAs) were protected in the
future (73, 74).

The long-term security and effective management
of Madagascar’s PAs is therefore crucial
to addressing the country’s biodiversity
challenges. Providing evidence of their effectiveness
and cobenefits, such as ecosystem
service provision, will be critical to securing
ongoing support and management from local
communities as well as from local and national
governments. However, measuring PA effectiveness
(e.g., at avoiding deforestation or providing
alternative livelihoods) is challenging when
accounting for numerous covariates (75),
particularly in Madagascar, where there are
comparatively few long-term biodiversity
monitoring data (76). Recent counterfactual
analyses (77) have sought to address this
question by identifying protected and nonprotected
sites that are similar across multiple
social and environmental variables and then
comparing indicators of conservation effectiveness,
such as deforestation rate. These
analyses indicate that PAs have a small but
important role in reducing deforestation (9).
We show that since 1990, human impacts
have measurably increased across all terrestrial
PAs (table S8) (9), a trend documented
worldwide (75). Human activity by local communities
inside PAs is not necessarily detrimental
to biodiversity, and land use and
conservation are therefore not mutually exclusive.
Nevertheless, land conversion and unsustainable
exploitation remain major drivers of
biodiversity loss. This suggests that protecting
and realizing the potential of Madagascar’s
comprehensive PA network will require the
application of rigorous monitoring and evaluation
strategies matched with extensive community
collaboration to understand cobenefits
and minimize detrimental human effects.
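The counterfactual logic described above (pair each protected site with a similar nonprotected site, then compare an outcome such as deforestation rate) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the site records, the two covariates, and the loss rates are hypothetical toy values rather than data from this review, and simple nearest-neighbor matching stands in for whatever matching procedure the cited analyses (77) actually used.

```python
from math import dist
from statistics import mean

# Hypothetical toy data, NOT from the paper. Covariates are two made-up,
# pre-scaled variables (e.g., remoteness and elevation); loss_rate is a
# made-up annual deforestation rate in percent per year.
SITES = [
    {"name": "PA-1", "protected": True, "covariates": (0.10, 0.90), "loss_rate": 0.4},
    {"name": "PA-2", "protected": True, "covariates": (0.80, 0.20), "loss_rate": 1.0},
    {"name": "C-1", "protected": False, "covariates": (0.15, 0.85), "loss_rate": 0.6},
    {"name": "C-2", "protected": False, "covariates": (0.75, 0.25), "loss_rate": 1.3},
    {"name": "C-3", "protected": False, "covariates": (0.50, 0.50), "loss_rate": 2.0},
]


def split_sites(sites):
    """Separate protected sites from nonprotected candidate controls."""
    protected = [s for s in sites if s["protected"]]
    unprotected = [s for s in sites if not s["protected"]]
    if not protected or not unprotected:
        raise ValueError("need at least one protected and one nonprotected site")
    return protected, unprotected


def naive_difference(sites):
    """Unmatched comparison: mean protected rate minus mean nonprotected rate."""
    protected, unprotected = split_sites(sites)
    return mean(s["loss_rate"] for s in protected) - mean(s["loss_rate"] for s in unprotected)


def matched_effect(sites):
    """Mean outcome difference between each protected site and its nearest
    nonprotected neighbor in covariate space (matching with replacement)."""
    protected, unprotected = split_sites(sites)
    differences = []
    for site in protected:
        match = min(unprotected, key=lambda c: dist(site["covariates"], c["covariates"]))
        differences.append(site["loss_rate"] - match["loss_rate"])
    return mean(differences)


print(f"naive difference:   {naive_difference(SITES):+.2f} percentage points/year")
print(f"matched difference: {matched_effect(SITES):+.2f} percentage points/year")
```

In this toy example the matched difference is smaller in magnitude than the naive one, because the dissimilar, rapidly deforested control site no longer inflates the comparison. That is the kind of bias that matching on covariates is intended to reduce.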
Scores for deforestation and management
effectiveness—for example, from the self- 
reported Management Effectiveness Tracking 
Tool (78)—have been the main metrics used to 
monitor effectiveness to date. However, these 
are not always reliable indicators of management
effectiveness (76). New and expanded
capacity for variables such as remotely sensed
fire and stable night lights, with increased
temporal resolution, offers promising new
monitoring opportunities. How fire is associated
with land transformation in Madagascar has 
been discussed in the literature but has only 
recently been quantitatively assessed (43), demonstrating
that tree loss anomalies are highest
in environments where landscape-scale
fire (>21 ha) does not occur and where the
role of small-scale fires (<21 ha) requires close
and urgent investigation. We show that trends
in anthropogenic fire are variable, increasing 
in some areas of forest vegetation in the north, 
east, and west but decreasing in grassland- 
woodland mosaic vegetation across central 
Madagascar (Fig. 4, A and B). Forest loss also 
reflects this pattern, primarily occurring in 
the humid forest biome in the east but also in 
dry forest and spiny forest in the west (Fig. 4, C 
and D). Deforestation and land use conver- 
sion remain key challenges to conservation in 
Madagascar, and improved remote sensing
will accelerate monitoring and the development of an
understanding of the effectiveness of PAs and
other conservation measures.


Ex situ conservation and restoration 


Living plant collections in botanic gardens 
and seed banks represent invaluable sources 
of taxonomic and genetic diversity for imme- 
diate conservation and research and should 
continue to support restoration efforts. Glob- 
ally, 29.6% of all known native Malagasy plant 
species (23.1% of endemic species and 23.1% of 
native threatened species) are held in botanic 
gardens, with 15.5% held in Madagascar (10), 
where their cultivation is sometimes linked 
to educational programs and community en- 
gagement essential to raising awareness of 
biodiversity and conservation issues. The Mil- 
lennium Seed Bank Partnership in Madagascar, 
initiated in 1996, hosts collections of an esti- 
mated 3500 native Malagasy species, includ- 
ing members of four of the five endemic plant 
families and all seven of the iconic baobab 
species (Adansonia spp.). The single Malagasy 
plant species listed as EW, Aloe silicicola, now 
only survives in one living collection outside 
Madagascar. 

Fig. 4. Recent changes and patterns in burned area and tree cover in Madagascar. (A) Average burned
area in the period 2003 to 2019. (B) Statistically significant trends in burned area (MODIS) (131) from
2006 to 2016, not explained by precipitation change (TRMM) (132), dates chosen for comparison with
Goodman et al. (71). Red indicates an increasing trend, and blue indicates a decreasing trend. (C) Change in
tree cover from 2000 to 2012 (133). (D) Vegetation map, inferred and simplified from Moat and Smith (134).
The legend indicates the percentage of each vegetation category currently covered by the PA network.

For native terrestrial and freshwater vertebrates,
9% of amphibians, 17% of mammals,
20% of reptiles, 21% of freshwater fishes, and
33% of birds are currently held in zoological
collections (18% overall) (9, 79). Many are
part of active breeding programs, but only 3%
of amphibians, 7% of reptiles, 11% of freshwater
fishes, 13% of mammals, and 23% of
birds were successfully bred during 2020 (9).
Unsurprisingly, the species held in captive
breeding facilities are biased toward the more
charismatic, well-known taxa (80). For example,
among amphibians, 13 of the 34 species in
zoos belong to the genus Mantella, a group of 
strikingly colored diurnal frogs, even though 
Mantella contains only 4% of Madagascar’s
amphibian fauna. Freshwater fishes, amphib- 
ians, and reptiles are highly suitable for tar- 
geted ex situ breeding and reintroduction 
programs (81-84). For species in these groups 
and others with high levels of micro-endemism, 
such conservation programs continue to repre- 
sent a major safeguard against extinction (85). 
This complies with the One Plan Approach to 
species conservation proposed by the IUCN 
SSC Conservation Planning Specialist Group, 
which supports the development of conser- 
vation and management plans for all pop- 
ulations of a species, even outside of their 
natural range (86). It should be noted that 
the success of reintroduction relies also on 
the maintenance of natural habitat and func- 
tional diversity at potential reintroduction 
sites, along with the minimization of risks 
associated with invasive species and infec- 
tious diseases. In addition, particularly for 
mammals, vulnerability of captive-bred pop- 
ulations to predation can also jeopardize the 
success of reintroductions (87). 


Progress toward international 
conservation commitments 


Madagascar continues to make progress toward 
Convention on Biological Diversity targets 
but, like most countries, falls short of meeting 
them in full (88). Of particular relevance is 
that Madagascar did not formally meet Aichi 
target 11 to protect at least 17% of its total land 
area (Fig. 3)—as was the case for 48% of the 
parties reporting their progress (88). If areas 
designated as important for biodiversity but 
not currently under formal protection were 
also given protection, the total percentage of 
PA coverage would rise from the current 10.4 
to 13% (Fig. 3B). However, given that even the 
existing network is widely considered to be 
chronically under-resourced, this action is not
a priority for the near future (89, 90). 
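The coverage figures quoted in this review can be cross-checked with a little arithmetic. The minimal Python sketch below takes the reported PA area (61,300 km²), the area of sites not under formal protection (15,200 km²), and the reported 10.4% coverage, and derives the rest. The land area is inferred from these rounded figures rather than taken from the source, so the outputs are approximate, and all variable names are ours.

```python
# Figures quoted in the text.
PROTECTED_KM2 = 61_300             # terrestrial PAs with permanent status
UNPROTECTED_PRIORITY_KM2 = 15_200  # additional sites without formal protection
CURRENT_COVERAGE = 0.104           # reported share of land area under protection
AICHI_TARGET_11 = 0.17             # Aichi target 11: at least 17% of land area

# Assumption: land area implied by the two rounded figures above,
# not a value stated in the source.
implied_land_km2 = PROTECTED_KM2 / CURRENT_COVERAGE

# Coverage if the currently unprotected priority sites were also protected.
expanded_coverage = (PROTECTED_KM2 + UNPROTECTED_PRIORITY_KM2) / implied_land_km2

# Extra protected area that would be needed to reach the 17% target.
shortfall_km2 = AICHI_TARGET_11 * implied_land_km2 - PROTECTED_KM2

print(f"Implied land area: {implied_land_km2:,.0f} km2")
print(f"Coverage with priority sites added: {expanded_coverage:.1%}")
print(f"Additional area needed for 17%: {shortfall_km2:,.0f} km2")
```

The expanded coverage comes out at about 13%, consistent with the figure stated above. Protecting all of the listed priority sites would therefore still leave the network short of the 17% target.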

Target 4 of the Global Strategy for Plant 
Conservation (GSPC) seeks to protect 15% of 
each vegetation type. This has been achieved 
for mangrove (currently at 29.4%), spiny forest 
(21.5%), humid forest (18.5%), and tapia (17.9%) 
but not for dry forest (13.3%), subhumid forest 
(5.7%), and grassland-woodland mosaic (1.8%) 
(table S6) (9). However, expansion of the areas 
of those vegetation types under protection may 
not be feasible because of limited financial re- 
sources, the large degree of fragmentation and 
geographical spread of habitats, and the long 
administrative process involved in extending 
PAs or designating additional areas, as well as 
a lack of political will. It also may not be de- 
sirable until it can be demonstrated that the 
existing PAs are well resourced, achieving con- 
servation objectives and providing benefits to 
communities. Restoration within current PAs 
may provide a longer-term pathway to meeting
this goal, particularly where there are rapidly
realizable socioeconomic benefits, such
as sustainable silk production from wild native 
silkworms (Borocera cajani) associated with 
tapia (Uapaca bojeri) in the Itremo Massif PA 
and Ambatofinandrahana KBA. Other targets 
are more difficult to assess because of a lack 
of data. For example, there is very little evi- 
dence to assess success in the control of in- 
vasive alien species, with some exceptions such 
as the ongoing but promising house crow 
(Corvus splendens) eradication (91). Although
most of the Aichi and GSPC targets were either 
not achieved or cannot be assessed, a marked 
success is that Madagascar has comfortably 
achieved GSPC target 7 (at least 75% of known 
threatened plant species conserved in situ), 
with our analyses indicating that this per- 
centage is currently at 80%. 
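The GSPC target 4 comparison reported earlier in this section (15% of each vegetation type protected) can be reproduced as a simple threshold check. The minimal Python sketch below uses the coverage percentages quoted in the text; the variable names and the "gap" calculation are ours and are included only to make the size of each shortfall explicit.

```python
GSPC_TARGET_4 = 15.0  # percent of each vegetation type to be protected

# Percentage of each vegetation type inside the PA network, as quoted in the text.
coverage_pct = {
    "mangrove": 29.4,
    "spiny forest": 21.5,
    "humid forest": 18.5,
    "tapia": 17.9,
    "dry forest": 13.3,
    "subhumid forest": 5.7,
    "grassland-woodland mosaic": 1.8,
}

# Vegetation types at or above the target.
target_met = [name for name, pct in coverage_pct.items() if pct >= GSPC_TARGET_4]

# Shortfall, in percentage points, for vegetation types below the target.
gap_to_target = {
    name: round(GSPC_TARGET_4 - pct, 1)
    for name, pct in coverage_pct.items()
    if pct < GSPC_TARGET_4
}

print("Target met:", ", ".join(target_met))
for name, gap in gap_to_target.items():
    print(f"{name}: {gap} percentage points below target")
```

Dry forest is within two percentage points of the target, whereas subhumid forest and grassland-woodland mosaic are much further from it.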


Realizing the benefits of biodiversity for people 


The majority of Madagascar’s more than 28 
million inhabitants live outside of, but often 
very close to, PAs (92) (Fig. 3A and fig. S1). 
These communities face challenges connected 
to widespread poverty, which itself is related 
to the degradation of natural capital in the 
landscape, limited access to formal education 
and health care, crime, corruption, weak gov- 
ernance, and regulatory issues including land 
tenure (5, 93, 94). For example, southern 
Madagascar is severely affected by food and 
water insecurity, which catalyzes political and 
social instability, exacerbates economic inse- 
curity, and has led to large-scale migration 
within the country (95). This instability like- 
wise hampers the operations of local, national, 
and international conservation organizations, 
which could be compounded further by ad- 
verse effects from climate change (59). Because 
the human population in the country is ex- 
pected to reach 42 to 105 million by the end of 
this century, of which half will be under 15 years 
of age and with the majority under the poverty 
threshold (96), the conservation success of PAs 
will be inextricably linked to the effective pro- 
vision of livelihoods, food security, and natural 
capital—a situation echoed across all Malagasy 
ecosystems and the world over (97). 


Looking back, moving forward 


Despite decades of research and applied con- 
servation programs supported through substan- 
tial financial investments (94, 98), Madagascar’s 
remarkable biodiversity continues to face sev- 
ere challenges (Figs. 1 and 2). It is reasonable 
to ask whether more of the same tactics—even 
if better resourced and underpinned with greater 
scientific understanding and technology—are 
likely to deliver a tangible reversal in Madagascar’s 
trajectory of biodiversity loss, or whether new 
approaches are required to bring transforma- 
tive change (99), including greater emphasis 
on monitoring interventions and addressing 
underlying drivers through key leverage points. 
The responsibility for averting humanitarian
and biodiversity crises is a shared global chal- 
lenge (36, 100), with solutions needed at all 
societal levels—including through local com- 
munities, engagement of the private sector, 
sound leadership and policy from regional and 
national governments, steady international sup- 
port for conservation, and increased recogni- 
tion of how historic and ongoing global and 
national inequalities have contributed to the 
current situation. Scientific data and evidence 
will continue to make a vital contribution, but 
it is crucial that this is done in an interdis- 
ciplinary context, with open communication 
channels to relevant government departments 
and third-sector organizations. 


Decades of progress in biodiversity science 
and conservation 


We now have a clearer and more detailed 
understanding than ever before of the past 
and present diversity and distribution of 
Madagascar’s biodiversity and the threats it 
faces (1) (Fig. 1). The underlying data are the 
product of decades of research—with an in- 
creasing number of Malagasy biologists in- 
volved. This body of research and the evidence 
we have collated and presented here makes a 
clear case for Madagascar as one of the world’s 
foremost conservation priorities. 

Despite multiple competing demands on 
land, the Malagasy government, in collabora- 
tion with a broad group of conservation or- 
ganizations and donors, has succeeded in 
designating 10.4% of the country as terrestrial 
PAs in a network that is largely representative 
of Madagascar’s diverse biomes (Figs. 3 and 4). 
Most terrestrial and freshwater vertebrate spe- 
cies with known distributions have ranges 
that overlap with at least one PA (94.7% of reptiles,
97.2% of amphibians, 98.1% of mammals,
98.9% of freshwater fishes, 100% of birds, and 
97.1% for all groups combined) as do the maj- 
ority of plants, although to a lesser extent 
(67.7%) (9). For threatened species with known 
distributions, the percentages are similar for 
vertebrates (94.3% of reptiles, 99.3% of amphi- 
bians, 97.7% of mammals, 100% of freshwater 
fishes, 100% of birds, and 97.7% for all groups 
combined) and markedly higher for plants 
(79.6%). Nonetheless, there are still many 
threatened species with ranges that do not 
overlap with the existing PA network, includ- 
ing one amphibian, three mammals, seven 
reptiles, and 559 plants (9) as well as many 
more that have not yet been assessed but may 
be threatened. The ranges of all birds over- 
lapped with at least one PA; this was also true 
when we filtered the analysis to only include 
resident and breeding areas (9). 

Since the loss of Madagascar’s terrestrial 
megafauna (here defined as vertebrates larger 
than 10 kg), there have been few documented 
modern extinctions, but many species have 
perilously reduced population sizes. The continued
increase in new species descriptions
suggests that there may be undocumented extinctions,
especially in poorly studied taxa (1).
Despite this, with limited resources and/or 
capacity, Madagascar has made important 
progress toward achieving international cli- 
mate, biodiversity, and sustainable develop- 
ment goals, providing a foundation on which 
to build in the coming decades. 

Success stories for individual species high- 
light how positive collaborative efforts can 
avert extinction. Examples include work on the 
Madagascar pochard (Aythya innotata) (101), 
which shows a 30% probability that extinction 
was prevented because of conservation action; 
the success story of the community-based pro- 
tection of the tahina palm or dimaka (Tahina 
spectabilis), where local communities were involved
in propagation and population reinforcement
(102); and the work to prevent
the extinction of the ploughshare tortoise
(Astrochelys yniphora) through a captive breeding
program (103).

Other notable successes have come from 
Madagascar’s biodiversity conservation boom, 
which started in the 1980s and included a 
growth in the number of students pursuing 
university-level education in environmental 
sciences, biodiversity conservation and man- 
agement, and related fields at both public 
and private universities. The result is an in- 
creasingly robust national capacity for the 
conservation and management of biodiversity 
that extends to international conservation or- 
ganizations, which have been able to actively 
recruit Malagasy professionals to the highest 
administrative and executive positions. Going 
beyond this, the gap in scientific leadership 
that underpins conservation evidence is being 
incrementally filled by Malagasy biodiversity 
scientists. Researchers from outside Madagascar 
are increasingly collaborating with Malagasy 
researchers for mutual benefit. The require- 
ment for international collaborators to provide 
financial and technical support for Malagasy 
researchers and their research infrastructure 
through collaboration protocols—set out in 
the national strategy for scientific research in 
Madagascar (104)—reinforces the importance 
of this. 

As in many low-income countries, insuffi- 
cient public funding means that the number of 
Malagasy professionals is still insufficient to 
serve the country’s needs, there are relatively 
few PhD positions available to students, and 
those that are trained at higher levels often 
move away from academia and into the pri- 
vate sector. Access to up-to-date biodiversity 
data has also been a limiting factor (75). A 
further challenge is how to successfully engage 
multiple parts of society in conservation. Ef- 
forts that are genuinely socially integrated 
have been shown to produce more effective 
and resilient practices, policies, and decision-
making, especially in the face of unstable en-
vironmental, political, and health situations
(105). The Madagascar Fauna and Flora Group, 
the Lemur Conservation Foundation, the 
Durrell Wildlife Conservation Trust, The 
Peregrine Fund Madagascar, the Madagascar 
Biodiversity Center, and Madagasikara Voakajy, 
as well as the work of the Royal Botanic Gar- 
dens, Kew, and the Missouri Botanical Garden, 
are all examples of successful collaborations 
involving researchers, conservation partners, 
and local communities to protect biodiversity 
and empower local people. 


The future of biodiversity in Madagascar 


Meeting the Convention on Biological Diver- 
sity’s Post-2020 Global Biodiversity Framework 
2030 targets and milestones and achieving 
the 2050 goals (106) will be challenging—in 
Madagascar and globally. Evaluating successes 
and failures over previous decades and learning 
from these to prioritize effective conservation 
investment will be particularly important. To 
embrace diverse views and promote inclusiv- 
ity in the identification of future directions, we 
discussed our results and current literature 
among our coauthors and consulted with 
Malagasy and external researchers, conserva- 
tion leaders, and politicians to arrive at five 
main opportunities for the future, which we 
present here. 

1) Investment in conservation and restora- 
tion must be based on evidence, effectiveness, 
and future challenges. Since the 1980s, billions 
of US dollars from international donors and 
conservation organizations, in cooperation 
with the Malagasy government, have been de- 
dicated to protecting the country’s biodiversity 
and creating today’s network of PAs (98, 107). 
However, the effectiveness of many interven- 
tions is poorly understood because impact 
evaluations are absent or lacking rigor. Evalu- 
ating the effectiveness of conservation activ- 
ities is challenging, but it is the subject of 
increasingly sophisticated research efforts 
(75, 77, 108). Nevertheless, it is imperative that 
investments reinforce evidence-based and reg- 
ularly evaluated interventions, requiring greater 
collaboration and co-design between local 
communities, regional and national author- 
ities, researchers, the private sector, and other 
stakeholders. A particular opportunity is to 
frame these evaluations around community- 
based conservation interventions that address 
challenges faced by people and nature in uni-
son. For example, nature-based solutions (109)
for diversified, locally adapted, and sustain- 
able agriculture can help address livelihood 
needs, whereas more efficient stoves can sub- 
stantially decrease the demand on charcoal 
from native forests for cooking and heating 
and, further, may reduce the health hazards of 
smoke inhalation. Such initiatives increase food
and energy security (110) while providing re-
silience to climate stochasticity (111). Similarly,
coordinated, community-based fire manage- 
ment and awareness raising can be used to 
help mitigate risk to fire-sensitive forests. 
On-site management is especially important 
for fire mitigation, as shown by a study con-
ducted during the COVID-19 pandemic (112).
Fire management also presents the opportu- 
nity to mitigate the effect of exotic species by 
targeting the removal of flammable invasives 
(e.g., Pinus) and guide appropriate tree-plant- 
ing initiatives to avoid fire-prone plantations 
near areas of particular biological importance. 
Such measures can improve the quality of graz- 
ing land for livestock while reducing carbon 
emissions from fire and helping to protect bio- 
diverse habitats. 

2) Expanded biodiversity monitoring is key 
to safeguarding Madagascar’s most valuable 
natural assets. Existing biodiversity data are 
sufficient to characterize major conservation 
challenges and robustly support the orienta- 
tion of conservation efforts in Madagascar. 
Calling for the collection of additional data 
risks delivering diminished returns on invest-
ment for conservation planning (113). Never-
theless, from collating the information for this
Review, we acknowledge a clear need to ad- 
dress gaps in understudied ecosystems, taxa, 
and genetically distinct populations, noting 
that many newly described species are already 
threatened (114) and in need of immediate
protection. Monitoring is also crucial for the 
detection of new non-native and potentially 
invasive species as well as for providing im- 
portant data for the management of those 
that have already taken hold. Increasing con- 
nections with international trading partners 
without concurrent improvements in capacity 
for biosecurity increases Madagascar’s vulner-
ability to such species (115), and strategies to
monitor and mitigate these risks while deliver- 
ing near-term benefits are needed. 

Although there are initiatives that provide 
broad overviews of conservation effectiveness 
(108), many conservation interventions lack 
impact evaluations, in part because of a lack 
of robust, long-term monitoring data for bio- 
diversity and social outcomes. The major gap 
is a lack of capacity for robust biodiversity mo- 
nitoring. An example of the increasing value of 
data and coherency in conservation efforts is 
the development of the Madagascar Protected 
Areas website (116), which consolidates much 
of the information about Madagascar’s exten- 
sive network of PAs. But as with many ini- 
tiatives, the key is in long-term financing and 
maintenance of these portals and in ensuring 
that data flow freely and openly to similar, 
global initiatives like Protected Planet (72). 

Biological monitoring needs to be based 
on consistent, repeatable methodologies with 
shared data. This information provides the 
science-based evidence needed to leverage
international funding and government policy 
support. Monitoring is one area where new 
technologies will play a key role, such as 
through the increasing availability of near- 
real-time satellite images and small and cost- 
effective unmanned aerial vehicles, which can 
increase visual access to remote areas (117).
Similarly, DNA-based biodiversity surveys, in- 
cluding environmental sampling, can greatly 
improve the speed of site inventories and the 
identification of unknown and understudied 
taxa. Advances in monitoring must be deli- 
vered with improved and centralized manage- 
ment. This should include open-source and 
transdisciplinary data on biodiversity, social 
and conservation governance, and perform- 
ance. These data should be in formats that are 
accessible and useful to practitioners, identify 
relevant baselines, and support evidence-based 
decisions for conservation and restoration. 

3) Improving the effectiveness of existing 
PAs is more important than creating new ones. 
Madagascar has an extensive, evidence-based, 
and highly representative network of terres- 
trial PAs (Figs. 3 and 4). Madagascar’s existing 
PAs already include at least partial ranges of 
a substantial proportion of Malagasy taxa, 
including most Malagasy EDGE species. Fo- 
cusing on improving their quality and effec- 
tiveness will likely lead to positive biodiversity 
outcomes (118), further increasing the already
measurable effect that PAs have had on bio- 
diversity. By strengthening PAs, biodiversity 
can be conserved across ecosystem, species, 
and genetic levels, all of which are integral in 
long-term conservation, as discussed above. 
Investment in restoration of degraded areas 
within and beyond the existing network (see 
opportunity 4 below) will provide multiple 
benefits for biodiversity and people. This could 
help increase the resilience of habitats to fu- 
ture drivers of biodiversity loss, including cli- 
mate change, while increasing potential ranges 
of many species in parallel. Demonstrating 
the benefits of strengthened PAs to people is 
a likely prerequisite for societal support to 
maintain and improve upon the existing net- 
work while mitigating the risk of future down- 
grading, downsizing, or degazettement (legal 
removal of conservation status) (119). Finan-
cial benefits that come with strengthened PAs
must be distributed appropriately and equitably 
within the country’s political and social con- 
texts, with the full inclusion of local commu- 
nities at all stages (118, 120). 

4) Conservation and restoration should not 
focus solely on the PA network. Madagascar’s 
PAs are islands of natural capital in a land-
scape of degraded natural resources (121) and
therefore provide vital resources for commun- 
ities living adjacent to them. Traditional fort- 
ress conservation—seeking to protect areas by 
limiting access—is therefore both undesirable 
and unlikely to be effective. To further reduce
the detrimental human impacts that exist in 
all PAs (98) (table S8) (9), we argue for strat- 
egies to enhance the natural capital of the sur- 
rounding landscapes, to reduce pressure on 
PAs as providers of basic resources, and to 
increase buffer zones for the species that live 
in and around them. This could include in- 
creasing ecosystem provision, such as produc- 
tive soils, food, fibers, and other materials and 
services such as water flow regulation and 
carbon capture. Such measures would serve 
to address some of the largest threats to spe- 
cies, including the expansion of agriculture 
and overexploitation (Fig. 2). 

In particular, ecological restoration could 
benefit people and biodiversity, particularly 
when targeted to the 89.6% of the country 
that is not protected. It offers potential to pro- 
vide new livelihood opportunities that are far 
from, and independent of, the resources within 
PAs, further reducing pressure on the system 
(122). Notably, restoration should not only tar- 
get those ecosystems that traditionally receive 
the most conservation attention because they 
hold the greatest biodiversity, for example 
forests. Other vegetation types, such as grass- 
lands, where most agriculture takes place, are 
equally vital. Restoration should be carried out 
following best practice and in places where 
people will benefit most—not necessarily only 
adjacent to PAs. Further, restoration should 
include maximizing biodiversity recovery to 
meet multiple goals, using resilient species, 
and working together with local communities 
(49, 123). 

For the species and their inherent genetic 
diversity not covered by the PA network, par- 
ticularly those that are challenging to con- 
serve, such as freshwater fishes and palms, 
ex situ conservation in zoological and botan- 
ical gardens is a vital tool to support conser- 
vation and restoration. For plants, efforts should 
especially focus on the 32.3% of plant species 
that fall outside of the PA network and the 
species that have cultural or economic value 
for people (e.g., crop wild relatives). Promoting 
biobanking for animals and intensifying it for 
seeds, spores, and fungi will not only support 
conservation but also contribute material and 
knowledge to restoration and research (87). 

5) Conservation actions must address the 
root causes of biodiversity loss. Our analysis 
shows that the most frequently listed threats 
to Madagascar’s biodiversity come from over- 
exploitation and agriculture, predominantly a 
result of forest loss and potentially tied to in- 
creases in small-scale anthropogenic fire in 
forests (Fig. 4, A and B) [see also (43)], which 
significantly affects humid forest areas in the 
east and dry forest and spiny forest in the west 
(Fig. 4, C and D). This trend is likely to con- 
tinue unless the root causes of this forest loss 
are addressed. Conservationists and their fund-
ers must recognize that food, social security,
health, and well-being are the utmost priorities 
for rural communities and that PAs will al- 
ways be vulnerable when surrounded by im- 
poverished people living in landscapes with 
eroded natural capital (124). Politicians and 
economists must recognize that sustainable 
and equitable development in Madagascar is 
inextricably linked to, and dependent on, the 
maintenance of ecosystem function and the 
goods and services they provide. Initiatives 
that address these issues by working with 
local communities to identify tailored solutions 
in health, education, and green entrepreneur- 
ship are increasingly successful and should be 
expanded, but they generally lack data and 
evidence from monitoring (see opportunity 
2). Promising approaches include voluntary 
savings and loans; inclusive, sustainable agri- 
cultural development schemes that promote 
stable land ownership and build—rather than 
destroy—natural capital and the ecosystem ser- 
vices it provides; implementation of conser- 
vation interventions, including research and 
monitoring; and PA management that max- 
imizes local employment (98, 123). Such efforts 
will facilitate improved livelihoods for many 
while reducing pressure on the PAs themselves, 
bringing tangible benefits to communities, 
and contributing to sustainable management 
(98, 125). 


Conclusions 


The alarming status of Madagascar’s biodiver- 
sity is the result of multifaceted, unsustainable 
practices that include historic and contempo- 
rary exploitation. In the eyes of much of the 
world, Madagascar’s biodiversity is a unique 
global asset that needs saving; in the daily 
lives of many of the Malagasy people, it is a 
rapidly diminishing source of the most basic 
needs for subsistence. Achieving a sustain- 
able future that benefits people and biodiversity 
is possible by building on and expanding in- 
tegrated, inclusive conservation efforts. Bio- 
diversity is the greatest opportunity and the 
most valuable asset for Madagascar’s future 
development. 
INTRODUCTION: Vaccines that induce antibodies
with predefined genetic features and binding 
specificities have promise to combat viruses 
with high antigenic diversity such as HIV, 
influenza, hepatitis C virus, and betacorona- 
viruses. Although these pathogens have eluded 
the development of vaccines that induce broad 
immunity covering their antigenic diversity, 
broadly neutralizing antibodies (bnAbs) have 
been discovered. Such bnAbs bind to relatively 
conserved epitopes on membrane glycoproteins 
of each pathogen, with features of each anti- 
body allowing binding to a particular epitope. 
If vaccines could be developed to consistently 
induce similar bnAbs, preferably in conjunc- 
tion with broad T cell immunity, protection 
against these pathogens might be achieved. 


RATIONALE: bnAbs acquire affinity-enhancing 
mutations when a bnAb-precursor B cell mu-
tates and matures from the original naive B
cell (or “germline”) state. Germline-targeting 
vaccine design aims to induce bnAbs by stim- 
ulating rare bnAb-precursor B cells that have 
antibody genes and other properties needed 
to develop into bnAbs for a specific epitope. 
This “priming” step must generate a pool of 
bnAb-precursor-derived germinal center and/or 
memory B cells that are susceptible to reacti- 
vation by a boost immunogen closer in struc- 
ture to the native viral glycoprotein. Sequential 
boosting with immunogens of increasing sim- 
ilarity to the native glycoprotein then aims 
to guide somatic hypermutation and affin- 
ity maturation to produce bnAbs that target 
the desired epitope. 


RESULTS: We conducted a first-in-human test of
the germline-targeting strategy by evaluating
the safety and immune responses of a germline-
targeting priming vaccine candidate, eOD-GT8
60mer nanoparticle adjuvanted with AS01B, in
the IAVI G001 phase 1 clinical trial. Each par-
ticipant received two administrations of placebo,
low-dose vaccine, or high-dose vaccine 8 weeks
apart. The eOD-GT8 immunogen was designed
to activate B cell precursors for HIV VRC01-class
bnAbs defined by their usage of heavy chain
variable gene alleles VH1-2*02 or *04 and any
light chain complementarity determining re-
gion 3 with a length of five amino acids. We
collected immune cells from the blood and lymph
nodes of participants and carried out epitope-
specific B cell sorting, B cell receptor (BCR)
sequencing, and bioinformatic and statistical
analyses. We also produced monoclonal anti-
bodies and measured their binding affinities for
the vaccine antigen. The vaccine had a favorable
safety profile and induced VRC01-class responses
in 97% (35 of 36) of vaccine recipients with
median frequencies reaching 0.1% among im-
munoglobulin G memory B cells in blood. bnAb-
precursors shared multiple properties with
bnAbs and made substantial gains in somatic
hypermutation and affinity with the boost.


CONCLUSION: The results establish clinical proof 
of concept for the germline-targeting vaccine 
design priming strategy, support development 
of boosting regimens to generate VRC01-class
bnAb responses against HIV, and encourage ap- 
plication of the germline-targeting strategy to 
other targets in HIV and other pathogens. 
Test of germline-targeting vaccine priming in healthy humans. Immune cells were isolated from recipients of eOD-GT8 60mer vaccine or placebo, and antibody
sequences from vaccine-binding B cells were analyzed to measure the VRC01-class bnAb-precursor response rate among participants and the frequency of
VRC01-class bnAb-precursor B cells among memory B cells (MBCs) in each participant. Somatic hypermutation (SHM) and binding affinity were measured.
Broadly neutralizing antibodies (bnAbs) can protect against HIV infection but have not been induced
by human vaccination. A key barrier to bnAb induction is vaccine priming of rare bnAb-precursor B cells.
In a randomized, double-blind, placebo-controlled phase 1 clinical trial, the HIV vaccine-priming
candidate eOD-GT8 60mer adjuvanted with AS01B had a favorable safety profile and induced VRC01-
class bnAb precursors in 97% of vaccine recipients with median frequencies reaching 0.1% among
immunoglobulin G B cells in blood. bnAb precursors shared properties with bnAbs and gained somatic
hypermutation and affinity with the boost. The results establish clinical proof of concept for germline-
targeting vaccine priming, support development of boosting regimens to induce bnAbs, and encourage
application of the germline-targeting strategy to other targets in HIV and other pathogens.


Development of a preventative HIV vac-
cine is needed to end the HIV/AIDS
pandemic (1). Broadly neutralizing anti-
bodies (bnAbs), which are Abs that bind
the envelope (Env) trimer and neutralize
diverse HIV isolates, have been shown to pro-
vide sterilizing protection in nonhuman pri-
mate (NHP) models (2), and infusion of the
bnAb VRC01 was shown to protect against
neutralization-sensitive HIV isolates in hu-
mans (3, 4). It is widely thought that an ef-
fective preventative HIV vaccine will need to
induce bnAbs.


HIV vaccine design strategies to elicit bnAbs 


bnAbs, like all antibodies, are produced by
B cells and acquire affinity-enhancing muta-
tions when a B cell matures from the original
naive (or germline) state. The discovery that
most HIV Env proteins have no detectable
affinity for bnAb germline precursors greatly
influenced the development of HIV vaccine
strategies, by indicating that special immuno-
gens with affinity for bnAb germline precursors
would be needed to prime bnAb responses and
that different booster immunogens would be
needed to select for antibody maturation to
produce bnAbs (5-14).

The HIV vaccine field is now pursuing at 
least three strategies to elicit bnAbs, each of 
which involves sequential vaccination with 
different antigens to guide the immune re- 
sponse through several stages of maturation. 
These strategies include (i) B cell lineage vac-
cine design, in which the series of immunogens
derives from the series of Env variants isolated
from longitudinal analysis of bnAb develop-
ment in a person with natural HIV-1 infection,
and the first (priming) immunogen is selected
to have affinity for the unmutated common
ancestor for the bnAb lineage in that case study,
and is usually the transmitted-founder Env in
that case study (15-20); (ii) germline-targeting
vaccine design, in which the priming immu- 
nogen is engineered to bind diverse precursors 
within a bnAb class (spanning many lineages), 
and boost immunogens are successively more 
like native Env trimers (13, 14, 21-29); and (iii) 
epitope-focused vaccine design, in which the 
series of immunogens aims to focus responses 
to one or more particular structural epitopes 
on the trimer (30-38). In each strategy, the 
priming stage is critical, because if appropriate 
B cell precursors with potential to develop into 
bnAbs are not stimulated at that stage, then 
the rest of the sequential vaccine will likely 
fail. Experimental medicine (phase 1) clinical 
trials are now underway or planned for each 
strategy, to test priming immunogens or se- 
quential combinations of immunogens for their 
abilities to elicit desired Ab responses. 


First-in-human test of germline targeting 


We conducted a first-in-human test of the
germline-targeting strategy by evaluating the
safety and immune responses of a germline-
targeting priming vaccine candidate, eOD-GT8
60mer adjuvanted with AS01B, in the IAVI G001
phase 1 clinical trial. The vaccine immunogen
is a self-assembling nanoparticle presenting
60 copies of an HIV gp120 engineered outer
domain, germline-targeting version 8 (eOD-
GT8), genetically fused to and arrayed externally
on an interior lumazine synthase nanoparticle
(13, 21, 39, 40). eOD-GT8 was designed to have
affinity for inferred-germline precursors to
VRC01-class bnAbs (13, 21, 39). VRC01-class
antibodies are minimally defined as those with
heavy chain (HC) V gene alleles VH1-2*02 or
*04 and any light chain (LC) complementarity
determining region 3 (LCDR3) with a length of
five amino acids (13, 14, 41, 42). These sequence
features define a broad class of antibodies with
diverse LCs and HCDR3s. In preclinical ex-
periments, eOD-GT8 was shown to bind to
diverse VRC01-class human naive B cells at
an average frequency of ~1 in 300,000 naive
B cells, in 26 of 27 donors tested (96%), and
with substantial affinity [geomean dissociation
constant (KD) of 4 μM] (21, 43, 44). Adjuvanted
eOD-GT8 60mer was shown to be capable
of priming VRC01-class B cell responses in
multiple different engineered mouse models
(27, 39, 45-49), including stringent models
that mimic two key parameters of human
vaccination: precursor frequency and affinity
(27, 46, 48, 49). Adjuvanted eOD-GT8 60mer
was also shown to prime VRC01-class responses
that can be boosted toward bnAb development
in mouse models (26, 27, 29). However, adju-
vanted eOD-GT8 60mer failed to elicit VRC01-
class responses in NHPs despite inducing robust
germinal center (GC) B and T cell responses and
serum responses (50), likely because of the lack
of a suitable human VH1-2 analog (13, 41, 51, 52)
and a lower rate of five-amino acid LCDR3s in
NHPs (50, 52). In the IAVI G001 trial, we sought
to determine whether human immunization
with adjuvanted eOD-GT8 60mer is safe and
effective for inducing VRC01-class immuno-
globulin G (IgG) B cell responses. Forty-eight
participants were immunized with either 20 μg
eOD-GT8 60mer and 50 μg AS01B (N = 18),
100 μg eOD-GT8 60mer and 50 μg AS01B
(N = 18), or the placebo Dulbecco’s phosphate-
buffered saline (DPBS) sucrose, the buffer dil-
uent used in the vaccine (N = 12) (fig. S1 and
table S1). Vaccine or placebo were administered
at weeks 0 and 8 intramuscularly in the same
deltoid. The full schedule of procedures for
safety and immunogenicity evaluation is given
in table S2.
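
The minimal sequence definition of a VRC01-class antibody given above (heavy chain V gene allele VH1-2*02 or *04, paired with any light chain whose CDR3 is five amino acids long) can be written as a simple predicate. The sketch below is illustrative only and is not the trial's bioinformatic pipeline: the `PairedBCR` record, its field names, and the IMGT-style allele strings such as `IGHV1-2*02` are assumptions made for this example.

```python
from dataclasses import dataclass

# Heavy chain V gene alleles named in the text. Writing them as
# IMGT-style strings is an assumption of this sketch.
VRC01_CLASS_HC_ALLELES = frozenset({"IGHV1-2*02", "IGHV1-2*04"})

# Required light chain CDR3 length, in amino acids, as stated in the text.
VRC01_CLASS_LCDR3_LENGTH = 5


@dataclass(frozen=True)
class PairedBCR:
    """Hypothetical record for one paired heavy/light chain BCR sequence."""
    hc_v_allele: str  # heavy chain V gene allele call, e.g. "IGHV1-2*02"
    lcdr3_aa: str     # light chain CDR3 amino acid sequence


def is_vrc01_class(bcr: PairedBCR) -> bool:
    """Return True if the BCR meets the minimal VRC01-class definition."""
    return (
        bcr.hc_v_allele in VRC01_CLASS_HC_ALLELES
        and len(bcr.lcdr3_aa) == VRC01_CLASS_LCDR3_LENGTH
    )


if __name__ == "__main__":
    # The sequences below are placeholders chosen only for their lengths.
    examples = [
        PairedBCR("IGHV1-2*02", "QQYEF"),      # right allele, 5-aa LCDR3
        PairedBCR("IGHV1-2*02", "QQYNNWPPT"),  # right allele, 9-aa LCDR3
        PairedBCR("IGHV1-69*01", "QQYEF"),     # different V gene
    ]
    for bcr in examples:
        print(bcr, is_vrc01_class(bcr))
```

Because the definition constrains only the heavy chain V allele and the LCDR3 length, any light chain V gene and any HCDR3 pass the check, which is why the class spans antibodies with diverse LCs and HCDR3s.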


Completeness, safety, and reactogenicity 


All but one study participant received both 
vaccinations; one declined the second vacci- 
nation because of a medical diagnosis un- 
related to the trial. Forty-five of 48 participants 
completed all study procedures, and only 1.0% 
of all 768 visits were missed (fig. S1 and table S3). 
No serious adverse events (AEs) were reported, 
and no participants acquired HIV-1 infection 
or developed serum positivity for HIV. Forty- 
seven of 48 participants (97.9%) reported local 
and/or systemic AEs (tables S4 to S6), but AEs 
were generally mild or moderate, resolved in 
most cases within 1 to 2 days, and were con- 
sistent with other vaccines (53). Overall, the 
vaccine had an acceptable safety and toler- 
ability profile. 


Serum antibody responses 


After the first immunization, all vaccine recipi-
ents, but no placebo recipients, produced serum
IgG binding antibodies to eOD-GT8 60mer and
monomer and to the eOD-GT8 CD4 binding
site (CD4bs) epitope, in which CD4bs binding
was indicated by stronger binding to eOD-GT8
than to eOD-GT8-KO11, an epitope-knockout
mutant that blocks binding by VRC01-class
precursor Abs and bnAbs (43, 46-49) (figs. S2
and S3 and tables S7 and S8). Vaccine-induced
responses to eOD-GT8 60mer and monomer
increased after the second vaccination, whereas
CD4bs-specific responses remained relatively
constant (figs. S2 and S3 and tables S7 and S8).
All participants exhibited preimmunization
reactivity to the lumazine synthase (LS) base
nanoparticle, and LS responses increased with
both immunizations in vaccine recipients (figs.
S2 and S3 and tables S7 and S8). However, the
magnitude of the baseline LS reactivity was not
associated with stronger or weaker responses
to the eOD-GT8 60mer, eOD-GT8 monomer,
or the CD4bs epitope (fig. S4). We detected no
serum neutralizing activity to any of several
viruses tested (table S9), as expected, because
the CD4bs in eOD-GT8 has been substantially
modified to enable binding to VRC01-class pre-
cursors. Induction of binding antibodies to
eOD-GT8 and its CD4bs but not neutralizing
antibodies was consistent with preclinical
experiments (27, 39, 45, 46, 48-50, 54). We
concluded that the vaccine was highly immu-
nogenic and induced class-switched, antigen-
specific and CD4bs-specific serum IgG responses.


B cell sorting and receptor sequencing as the 
critical immunological assay 


The major immunological objective of the trial 
was to determine whether the vaccine could 
induce VRC01-class IgG B cells, defined as IgG
B cells with VRC01-class B cell receptors (BCRs).
Toward that end, we developed an analysis 
workflow to determine and interpret BCR se- 
quences for eOD-GT8 CD4bs-specific IgG B cells 
using single B cell sorting, reverse transcriptase 
polymerase chain reaction (RT-PCR), DNA se- 
quencing, and bioinformatic analysis (figs. S5 
to S10 and tables S10 to S19). For each trial 
participant, we attempted to interrogate eight 
samples with this workflow, including mem- 
ory B cells and plasmablasts (PBs) from periph- 
eral blood mononuclear cell (PBMC) samples, 
and GC B cells from lymph node cells obtained 
by fine-needle aspiration (FNA) (Fig. 1A and 
tables S10 and S11). Complete data from sort- 
ing and sequencing, including quality-filtered 
sequences for the HC and LC of at least one 
CD4bs-specific BCR per sample, were obtained 
for 69.3% (266 of 384) of attempted samples 
(table S20). 


Immunogenicity assessed by antigen- and 
epitope-specific B cell sorting 

In PBMCs, all vaccine recipients produced eOD-
GT8- and CD4bs-specific IgG memory B cells
after the first immunization, with frequencies
significantly higher than in preimmunization or
placebo recipient samples, indicating vaccine-
induced responses (Fig. 1, B and C; figs. S11 and
S12; and tables S21 to S28). Frequencies of
eOD-GT8- and CD4bs-specific IgG memory
B cells in vaccine-recipient PBMCs increased
significantly after the second immunization
(Fig. 1, B and C; and tables S29 and S30), al-
though the fraction of eOD-GT8-specific cells
that were CD4bs-specific (KO−) decreased with
the boost (Fig. 1D and table S31). CD4bs-
specific IgG memory B cell frequencies in
PBMC samples peaked 2 weeks after the boost,
reaching median frequencies of ~1 in 300 and
~1 in 200 IgG B cells in the low- and high-dose
vaccine groups, respectively (Fig. 1C and tables
S27 and S30). Thus, the germline-targeting
immunogen induced substantial frequencies
of CD4bs-specific IgG memory B cells in pe-
ripheral blood.

In lymph node and PB samples, eOD-GT8-
and CD4bs-specific GC B cell or PB frequencies
were significantly higher in vaccine than pla-
cebo recipients (Fig. 1, B and C, and tables S23
and S24), indicating vaccine-induced responses.
Among vaccine recipients, frequencies of GT8-
and CD4bs-specific cells were generally higher
among IgG GC B cells or IgD− PBs than among
IgG memory B cells in PBMCs (Fig. 1), reflect-
ing the spatial and/or temporal enrichment
of vaccine-specific cells among GC B cells and
PBs. Overall, postimmunization class-switched
(IgD−) PBMC memory and lymph node GC
B cells specific for eOD-GT8 or CD4bs were
predominantly IgG and were enriched for
IgG compared with preimmunization memory
(fig. S13 and tables S32 to S34), justifying our
focus on IgG responses.


Detection and frequency quantification of 
VRC01-class IgG B cells


A total of 11,372 CD4bs-specific BCR sequences
with paired HCs and LCs were available from
266 samples to assess vaccine performance
(fig. S14). For each of these samples, we mea-
sured the number of VRC01-class IgG B cells
and their frequencies among IgG memory
B cells, GC B cells, or PBs, and we grouped
the results by time point, sample type, and
vaccine treatment (Fig. 2, A and B, and tables
S35 and S36). We then computed the pos-
itivity rate, defined as the percentage of each
group with at least one VRC01-class IgG B
cell detected, at each time point (Fig. 2C and
table S37).
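
The two summary statistics described above, a per-sample frequency and a per-group positivity rate, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the trial's analysis code: the `SampleResult` record and its fields are hypothetical, and the frequency here is a plain ratio of counts, whereas the actual estimate may include corrections for sorting and sequencing recovery. The `one_in_n` helper converts a percentage into the "1 in N" form that the text also uses.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SampleResult:
    """Hypothetical summary of one sample (one participant, one time point)."""
    participant: str
    vrc01_class_cells: int     # VRC01-class IgG B cells detected
    igg_b_cells_screened: int  # IgG B cells of this sample type screened


def frequency_percent(sample: SampleResult) -> float:
    """VRC01-class cells as a percentage of screened IgG B cells."""
    if sample.igg_b_cells_screened <= 0:
        raise ValueError("sample has no screened IgG B cells")
    return 100.0 * sample.vrc01_class_cells / sample.igg_b_cells_screened


def positivity_rate(group: Sequence[SampleResult]) -> float:
    """Percentage of samples in the group with at least one VRC01-class cell."""
    if not group:
        raise ValueError("group is empty")
    positives = sum(1 for s in group if s.vrc01_class_cells >= 1)
    return 100.0 * positives / len(group)


def one_in_n(freq_percent: float) -> float:
    """Express a percentage frequency as N in '1 in N'."""
    if freq_percent <= 0:
        raise ValueError("frequency must be positive")
    return 100.0 / freq_percent


if __name__ == "__main__":
    # Made-up counts for illustration only; these are not trial data.
    group = [
        SampleResult("P1", vrc01_class_cells=2, igg_b_cells_screened=1000),
        SampleResult("P2", vrc01_class_cells=0, igg_b_cells_screened=1000),
        SampleResult("P3", vrc01_class_cells=1, igg_b_cells_screened=4000),
    ]
    print(f"positivity rate: {positivity_rate(group):.1f}%")
    for s in group:
        print(s.participant, f"{frequency_percent(s):.3f}%")
```

Under this convention, a frequency of 0.1% corresponds to roughly 1 in 1,000 cells. Positivity depends only on whether any VRC01-class cell was detected, so it is sensitive to how many cells were screened per sample; that is one reason to report frequencies alongside it.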

In preimmunization (baseline) samples, we
detected one or two VRC01-class IgG memory
B cells in 6 of 48 participants (12.5%), with a
median frequency over responders of 2.33 ×
10⁻⁴% (one VRC01-class B cell in 429,000 IgG
memory B cells) (week −4 in Fig. 2, A and B).
Thus, preexisting VRC01-class IgG memory was
present in at least a minority of participants.
In postimmunization PBMC samples (weeks
4, 8, 10, and 16), we detected VRC01-class IgG
memory B cells in two placebo recipients
(2 of 12, 16.7%), both of whom also showed
preimmunization VRC01-class IgG memory
(participants 001 and 080 indicated in Fig. 2).

Fig. 1. Frequencies of antigen-specific and epitope-specific B cells.
(A) Schedule for immunization and B cell sampling. LN, lymph node. (B) Frequency
of eOD-GT8-specific (GT8++) IgG memory B cells (left), IgG GC B cells (middle),
and IgD− PBs (right) shown over time for each participant, grouped by vaccine
treatment. GT8++ indicates binding to two different eOD-GT8 fluorescent probes.
(C) Frequency of CD4bs-sp [...] lls that are CD4bs-specific (KO−), displayed as in (B).
Each symbol represents the frequency for one participant. Thick lines indicate
median values, and boxes indicate 25 and 75% quantiles.

Fig. 2. Detection of VRC01-class IgG B cells in blood and lymph nodes.
(A) Number of VRC01-class IgG B cells detected over time in each participant.
(B) Frequency of VRC01-class IgG B cells as a percentage of IgG B cells in
each participant. Median postvaccination frequencies are stated as 1:number of
IgG B cells. In (A) and (B), symbols represent participants, and the two placebo
participants with preexisting (week −4) VRC01-class B cells are indicated as
a square and a triangle. Thick lines indicate median values, and boxes indicate

Detection of postplacebo VRC01-class IgG
memory B cells in those participants was likely
due to preexisting VRC01-class IgG memory.
In postimmunization PBMC samples, we
detected VRC01-class IgG memory B cells in
significantly higher fractions of vaccine recip-
ients compared with baseline or placebo re-
cipients (Fig. 2C and tables S38 and S39), and
frequencies of VRC01-class IgG memory B cells
among IgG B cells were significantly higher
after vaccination compared with baseline (Fig.
2B and table S40). Our predetermined defini-
tion of a vaccine-induced VRC01-class IgG mem-
ory B cell response was the detection of one or
more VRC01-class IgG memory B cells with a
frequency higher than baseline for the same
participant. All postvaccination IgG mem-
ory samples with VRC01-class B cells detected
met that definition (table S36). In week 4
PBMCs, we detected VRC01-class memory B cell
responses in 17 of 18 participants in each
vaccine group [94.4%; 95% confidence inter-
val (CI), 74.2 to 99.0%] (Fig. 2C and table S37), 
with median frequencies of ~1 in 10,000 and 
~1 in 4500 IgG B cells among positive re- 
sponders in the low- and high-dose groups, 
respectively (Fig. 2B and table S41). These 
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week 4 frequencies of VRCO1-class memory 
B cells in the low- and high-dose groups were 
higher than the median prevaccination fre- 
quency among responders (1 in 429,000) by 
factors of 43 and 94, respectively, and higher 
than the previously reported average frequency 
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25 and 75% quantiles. Medians and quantiles were computed over nonzero 
values only because nonresponders are accounted for in (C). (€) Positivity of 
VRCOl-class IgG B cell detection, defined as the percentage of participants 

in each group with at least one VRCOL-class B cell detected for each time point 
and sample type. Circles indicate median values, and lines indicate 95% Cls 
computed using the Wilson score method. (D) Positivity of VRCO1-class 
responses over all time points or only after the first or second vaccination. 


for naive VRCO1-class precursors [1 in 300,000; 
(21, 43)] by factors of 30 and 67, respectively. 
In week 8 PBMCs, VRCO1-class positivity re- 
mained high, at 83.3% of the low-dose group 
and 94.4% of the high-dose group (Fig. 2C 
and table S37). Median VRCOl1-class IgG 
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memory frequencies among responders de- 
clined modestly from week 4 to week 8 (Fig. 
2B and tables S41 and S42), but week 8 me- 
dian frequencies remained >12-fold above 
the baseline VRCO1-class IgG B cell frequen- 
cy. We concluded that a single vaccination 
consistently induced strong VRCO1-class IgG 
memory B cell responses in the peripheral 
blood. 
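The positivity rate and its confidence interval can be checked by hand. The following is a minimal sketch, assuming the standard Wilson score formula for a binomial proportion (the method named in the Fig. 2 legend); the function names are illustrative and are not taken from the authors' analysis code.

```python
import math


def positivity_percent(n_positive, n_total):
    """Percentage of a group with at least one VRC01-class B cell detected."""
    return 100.0 * n_positive / n_total


def wilson_interval_percent(n_positive, n_total, z=1.959964):
    """Wilson score interval for a binomial proportion, returned in percent.

    z = 1.959964 is the standard normal quantile for a two-sided 95% interval.
    """
    p = n_positive / n_total
    denom = 1.0 + z**2 / n_total
    center = (p + z**2 / (2 * n_total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n_total + z**2 / (4 * n_total**2)) / denom
    return 100.0 * (center - half_width), 100.0 * (center + half_width)


# Week 4 PBMCs: 17 of 18 participants positive in each vaccine group.
rate = positivity_percent(17, 18)
low, high = wilson_interval_percent(17, 18)
print(f"{rate:.1f}% (95% CI, {low:.1f} to {high:.1f}%)")  # 94.4% (95% CI, 74.2 to 99.0%)
```

With 17 of 18 responders, this reproduces the 94.4% (74.2 to 99.0%) quoted for week 4. The Wilson interval is asymmetric around the point estimate, which suits proportions close to 100% in small groups.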

After the second vaccination, VRC01-class
positivity in week 10 and 16 PBMCs remained
high (88 to 100%) in both vaccine groups (Fig.
2C and table S37). Median VRC01-class IgG
memory frequencies among responders in
the low-dose group increased markedly, by a
factor of 28, to reach a week 10 peak frequency
of 0.088% (1 in 1139) and finally declined, by a
factor of 3.3, to a week 16 frequency of 0.027%
(1 in 3764) (Fig. 2B and tables S41 and S42).
Trends were similar but more favorable in the
high-dose group, with frequencies at weeks 10
and 16 of 0.13% (1 in 777) and 0.048% (1 in
2091), respectively (Fig. 2B and table S41).
Thus, at the peak response 2 weeks after the
second vaccination, median frequencies of
VRC01-class IgG memory B cells for >88% of
vaccine recipients in the low- and high-dose
groups were higher than the median prevaccination
frequency by factors of 380 and 550,
respectively, and higher than the previously
reported average frequency for VRC01-class
naive B cells by factors of 260 and 390,
respectively (21, 43). Six weeks after the peak, at
week 16, VRC01-class IgG memory B cell frequencies
remained significantly higher than on
the day of the second immunization (Fig. 2B
and table S42). We concluded that the second
immunization consistently increased VRC01-class
IgG B cell frequencies in the peripheral
blood.
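The "1 in N" frequencies, percentages, and fold increases quoted above are related by simple arithmetic. A small sketch, using only numbers stated in the text (the helper names are illustrative), shows the conversions; the published factors are rounded, so the computed values agree only approximately.

```python
def percent_from_one_in(n):
    """Convert a '1 in n' frequency to a percentage of IgG B cells."""
    return 100.0 / n


def fold_over(reference_one_in, observed_one_in):
    """Fold increase of an observed '1 in n' frequency over a reference frequency."""
    return reference_one_in / observed_one_in


BASELINE_ONE_IN = 429_000  # median prevaccination frequency among responders
NAIVE_ONE_IN = 300_000     # previously reported naive precursor frequency

# Week 10 median frequencies among responders, as stated in the text.
week10_one_in = {"low dose": 1139, "high dose": 777}

for group, one_in in week10_one_in.items():
    print(
        f"{group}: {percent_from_one_in(one_in):.3f}% of IgG B cells, "
        f"{fold_over(BASELINE_ONE_IN, one_in):.0f}-fold over baseline, "
        f"{fold_over(NAIVE_ONE_IN, one_in):.0f}-fold over naive precursors"
    )
```

For the low-dose group this gives roughly 0.088%, about 377-fold over baseline and about 263-fold over the naive precursor frequency, in line with the rounded factors of 380 and 260 reported above.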

In lymph node and PB samples, VRC01-class
response rates were generally lower than in
PBMC memory B cells (Fig. 2C). However, among
positive responders, VRC01-class frequencies
among IgG GC B cells and IgD− PBs were generally
higher than among PBMC IgG memory
B cells (Fig. 2B), illustrating the strong VRC01-class
response in the GC reactions.

Combining all postimmunization data, we
detected VRC01-class IgG B cell responses in
100% (18 of 18) of participants in the low-dose
group (median of 44 per participant) and
94.4% (17 of 18) of participants in the high-dose
group (median of 91 per participant),
for an overall response rate of 97% (35 of 36)
(Fig. 2D and table S37). After the first vaccination
only (weeks 3, 4, and 8), we measured
response rates of 94.4% (17 of 18) in
both the low- and high-dose groups, giving
an overall response rate of 94.4% (34 of 36)
(Fig. 2D and table S37). We concluded that
the eOD-GT8 60mer vaccine induced VRC01-class
IgG B cell responses with high consistency
across vaccine recipients and over
time after one or two vaccinations, in both
dose groups.


VH1-2 genotype analysis 


One high-dose vaccine participant contributed 
samples from all eight visits but had no detectable
VRC01-class IgG B cells. Only two of
the 540 HC sequences from this individual 
(PubID 059) were VH1-2, and both used the 
*06 allele, which suggested that the individ- 
ual might not possess either of the required 
*02 or *04 alleles. To evaluate the VH1-2 allele 
content in that individual and in all other 
participants, we carried out VH1-2 genotype 
analysis using IgDiscover for all 48 participants. 
Across all individuals, we detected five VH1-2 
alleles, including *02 and a novel variant of *02 
with a noncoding polymorphism, *02_S4953, 
both of which we will refer to as *02 here. The 
remaining alleles included *04, *05, and *06. 
Accounting for hetero- and homozygosity, we 
identified a total of nine genotypes, among 
which the most common were *02/*04 (27.1%), 
*02/*02 (22.9%), *04/*04 (20.8%), and *04/*06 
(14.6%) (table S43). Only one of the genotypes, 
*05/*06, included neither of the required 
VRC01-class alleles, and that genotype was
carried only by the participant who did not
produce detectable VRC01-class IgG B cell
responses. Thus, we found that vaccine induction
of VRC01-class responses by eOD-GT8
60mer is limited by VH1-2 genotype. However,
individuals lacking at least one of the required 
alleles represented only 2% (1 of 48) of this 
study population, consistent with prior analy- 
ses (13, 41, 44). 
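The genotype rule described above (a participant needs at least one *02 or *04 allele) can be written as a short check. This is an illustrative sketch only: the allele strings follow the notation used in the text, the helper names are hypothetical, and treating *02_S4953 as *02 mirrors the grouping stated above.

```python
# VH1-2 alleles described in the text as compatible with a VRC01-class response.
PERMISSIVE_ALLELES = {"*02", "*04"}


def normalize_allele(allele):
    """Collapse novel variants such as '*02_S4953' onto their parent allele ('*02')."""
    return allele.split("_")[0]


def is_permissive(genotype):
    """True if at least one of the two VH1-2 alleles is *02 or *04."""
    return any(normalize_allele(a) in PERMISSIVE_ALLELES for a in genotype)


# Genotypes named in the text, written as (allele 1, allele 2).
for genotype in [("*02", "*04"), ("*04", "*06"), ("*02_S4953", "*06"), ("*05", "*06")]:
    print(genotype, "permissive" if is_permissive(genotype) else "non-permissive")
```

Only the *05/*06 genotype is flagged as non-permissive, matching the single participant without a detectable VRC01-class response.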


Polyclonality of VRC01-class responses


Despite sharing the VH1-2 gene, VRC01-class
responses were highly polyclonal, with diverse
HCDR3s and LCs (fig. S15). BCR sequence
hierarchical clustering showed that all 2865
postvaccination VRC01-class BCRs were represented
by 1779 independent clusters (lineages)
originating from independent germline
recombination events; thus, >60% of VRC01-class
responses derived from distinct precursors
(Fig. 3, A and B). Very few clusters (0.11%)
were shared between participants (Fig. 3C),
relatively few clusters (9.7%) were shared over
time within individuals (Fig. 3C), and most of
the clusters (82.1 and 80.9% in the low- and
high-dose groups, respectively) contained a
single member (Fig. 3D). The depth of sampling
of the CD4bs-specific BCR repertoire was
limited by practical constraints, and deeper
sampling might have identified additional
clusters or members for each cluster. Nevertheless,
the number of independent VRC01-class
clusters per participant was substantial
(medians of 32 and 65 in the low- and high-dose
groups, respectively) (Fig. 3E), showing
that a large number of distinct precursors
were primed in each individual. High VRC01-class
polyclonality was observed in memory
B cells at all time points after the prime and
boost and in PBs, whereas in GCs, we observed
reduced levels of polyclonality concomitant
with detection of larger clonal families
(Fig. 3F). Thus, eOD-GT8 60mer primed responses
from a diverse pool of VRC01-class
precursors, and the clonal diversity was maintained
after the boost.
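Once every BCR sequence has been assigned to a cluster, the polyclonality measures used above reduce to simple counts. The sketch below assumes cluster assignments are already available as one label per sequence; it does not reproduce the hierarchical clustering itself, and the labels and function name are hypothetical.

```python
from collections import Counter


def cluster_summary(cluster_ids):
    """Summarize clonal diversity from one cluster label per BCR sequence."""
    sizes = Counter(cluster_ids)
    n_sequences = len(cluster_ids)
    n_clusters = len(sizes)
    n_singletons = sum(1 for size in sizes.values() if size == 1)
    return {
        "sequences": n_sequences,
        "clusters": n_clusters,
        # Fraction of sequences that correspond to a distinct cluster (lineage).
        "clusters_per_sequence": n_clusters / n_sequences,
        # Fraction of clusters that contain a single member.
        "singleton_fraction": n_singletons / n_clusters,
    }


# Toy example: five sequences falling into four clusters.
print(cluster_summary(["c1", "c1", "c2", "c3", "c4"]))

# The study-wide counts from the text give the ">60%" figure directly.
print(f"{1779 / 2865:.1%} of VRC01-class BCRs map to distinct clusters")
```

The ratio 1779/2865 is about 62%, consistent with the statement that more than 60% of VRC01-class responses derived from distinct precursors.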


Competitor responses: Non-VRC01-class
CD4bs-specific responses


VRC01-class responses were a minority of
CD4bs-specific IgG B cell responses. The
VRC01-class fraction of CD4bs-specific IgG
BCRs had median per dose group values of 16
to 28% in PBMC memory B cells, and 0 to 36%
in lymph node GC B cells and PBMC PBs, across
both vaccine groups and over time (Fig. 4A).
CD4bs-specific VRC01-class responses were an
even smaller fraction of total eOD-GT8-specific
IgG B cells, with median per dose group values
of 3.5 to 8% in memory B cells (Fig. 4B). These
values were substantially higher than results
from preclinical experiments with adjuvanted
eOD-GT8 60mer in Kymab mice (46) and VH1-2
recombining mice (27, 54), in which only 1%
and 0.3 to 3.2% of CD4bs-specific IgG BCRs
were VRC01-class, respectively, and were similar
to results from naive human B cell sorting
with eOD-GT8 tetramers, in which 15 to 20% of
CD4bs-specific naive BCRs were VRC01-class
(21, 43, 44). Here, CD4bs-specific, non-VRC01-class
IgG BCRs, defined as any CD4bs-specific
BCR not meeting the VRC01-class definition,
included non-VH1-2 BCRs as well as VH1-2
BCRs with LCDR3 lengths other than five.
However, VRC01-class BCRs made up a dominant
83% of VH1-2/kappa BCRs and 43% of
VH1-2/lambda BCRs, owing to strong enrichment
for five-amino acid LCDR3s (fig. S16).
CD4bs-specific, non-VRC01-class IgG B cells
were highly polyclonal (fig. S17), with diverse
gene usage and CDR3 lengths (fig. S18), just
as VRC01-class B cells were highly polyclonal (Fig. 3).
However, non-VRC01-class B cells reached
higher frequencies than VRC01-class B cells
among all IgG B cells. Median frequencies of
CD4bs-specific, non-VRC01-class IgG B cells
peaked at week 10 values of 0.27 and 0.39%
in the low- and high-dose vaccine groups,
respectively (fig. S19 and table S44), compared
with VRC01-class frequencies of 0.09 and 0.13%,
respectively. Thus, VRC01-class B cells were
induced despite a dominant competing CD4bs
response by a highly diverse pool of non-VRC01-class
B cells.
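The split between VRC01-class and non-VRC01-class BCRs rests on two sequence features named in the text: a VH1-2 heavy chain with a permissive allele and a five-amino acid LCDR3. A simplified sketch of such a classifier is given below. It is an assumption-laden illustration rather than the study's full definition: the `IGHV1-2*02` string format, the function names, and the toy records are all hypothetical.

```python
def is_vrc01_class(hc_v_allele, lcdr3_aa):
    """Simplified VRC01-class test: VH1-2 (*02 or *04) HC plus a 5-residue LCDR3."""
    gene, _, allele = hc_v_allele.partition("*")
    return gene == "IGHV1-2" and allele in {"02", "04"} and len(lcdr3_aa) == 5


def vrc01_class_fraction(bcrs):
    """Fraction of paired BCR records (HC V allele, LCDR3) that are VRC01-class."""
    flags = [is_vrc01_class(hc, lcdr3) for hc, lcdr3 in bcrs]
    return sum(flags) / len(flags)


# Toy CD4bs-specific records, invented for illustration only.
toy_bcrs = [
    ("IGHV1-2*02", "QQYEF"),      # VH1-2 with 5-aa LCDR3: VRC01-class
    ("IGHV1-2*04", "QQYNSYPLT"),  # VH1-2 but 9-aa LCDR3: non-VRC01-class
    ("IGHV3-23*01", "QQYEF"),     # 5-aa LCDR3 but not VH1-2: non-VRC01-class
    ("IGHV1-2*06", "QQYEF"),      # VH1-2 allele outside *02/*04: non-VRC01-class
]
print(f"VRC01-class fraction: {vrc01_class_fraction(toy_bcrs):.0%}")
```

In this toy set one of four records qualifies (25%), which illustrates how the per-sample fractions in Fig. 4 would be tallied from classified sequences.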


BCR mutation levels 


Changes in vaccine-induced BCR mutation levels over time can provide
insight into the immune processes underlying the response. To assess
changes in CD4bs-specific BCR mutation levels, we computed the median
percent mutation per participant per time point for HC V genes (VH) and
LC kappa and lambda V genes (Vκ/Vλ), in VRC01-class and non-VRC01-class
BCRs for both nucleotide and amino acid mutations (Fig. 5, A and B;
fig. S20; and table S45). We made statistical comparisons between time
points using paired data in which mutation levels were available from
the same individual at two time points (tables S46 to S48). We also
monitored the mutation distributions across all antibodies in each group
and time point using violin plots (Fig. 5, C and D). In the narrative
below, we provide example VH amino acid mutation values as indicators
for change.

Fig. 3. VRC01-class BCR hierarchical clustering and genetic diversity.
(A) Number of clusters versus number of BCR sequences, for all postvaccination
VRC01-class BCR sequences in both vaccine groups. The number of sequences per
cluster is indicated at the bottom. (B) Pairwise distance distributions for BCR
sequences in (A), including all versus all, intracluster (all versus all within
cluster), and intercentroid (between cluster centroids). (C) Number of clusters
involving single donor and time point, single donor and multiple time points, and
multiple donors, separately for the low- and high-dose groups. (D) Histogram of
cluster size for the low- and high-dose groups. (E) Number of clusters versus
number of VRC01-class BCR sequences for each participant in the low- and
high-dose groups. Dashed lines indicate equality in numbers of clusters and
sequences. (F) VRC01-class BCR polyclonality over time. Each symbol reports the
fraction of BCR sequences that cluster as unique clones within a single donor
at a single time point. Thick lines indicate median values, boxes indicate 25 and
75% quantiles, and whiskers approximate 10 and 90% quantiles. MBC, memory B cell.

Fig. 4. Frequency of VRC01-class B cells among CD4bs- or eOD-GT8-specific IgG B cells and PBs. (A) VRC01-class frequencies among CD4bs-specific B cells.
(B) VRC01-class frequencies among eOD-GT8-specific B cells. Each symbol represents the frequency for one participant. The two placebo participants with
preexisting (week −4) VRC01-class B cells are indicated as a square and a triangle. Thick lines indicate median values, and boxes indicate 25 and 75% quantiles.


After the first immunization, nucleotide and
amino acid mutation levels in VRC01-class
memory IgG VH and Vκ/λ genes generally
increased significantly from week 4 to week 8
in both vaccine groups, with the only exception
being VH amino acid mutation in the low-dose
group, which showed an increase that
barely missed significance (Fig. 5, A and C; fig.
S20; and tables S46 and S47). Thus, GCs remained
active and produced memory B cells
with increased somatic hypermutation (SHM)
beyond week 4. By week 8, VH amino acid
mutation levels reached median values of 1.5%
in both dose groups (table S45).
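The per-participant statistic behind these percentages is a median of per-sequence mutation levels. A minimal sketch follows, assuming each observed V-gene amino acid sequence has already been aligned to its germline without insertions or deletions; the sequences are toy strings and the function names are hypothetical.

```python
from statistics import median


def percent_aa_mutation(germline, observed):
    """Percent of aligned positions at which the observed sequence differs from germline."""
    if len(germline) != len(observed):
        raise ValueError("sequences must be pre-aligned to equal length")
    n_diff = sum(g != o for g, o in zip(germline, observed))
    return 100.0 * n_diff / len(germline)


def participant_median(germline, observed_sequences):
    """Median percent mutation over all BCRs from one participant at one time point."""
    return median(percent_aa_mutation(germline, seq) for seq in observed_sequences)


# Toy 20-residue germline segment and three observed variants (0, 1, and 2 changes).
germline = "QVQLVQSGAEVKKPGASVKV"
observed = [
    "QVQLVQSGAEVKKPGASVKV",
    "QVQLVQSGAEVKRPGASVKV",
    "QVQLVQSGSEVKRPGASVKV",
]
print(participant_median(germline, observed))  # 5.0
```

Paired comparisons between time points would then be run on these per-participant medians, for example with a Wilcoxon signed-rank test as stated in the Fig. 5 legend.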

After the second immunization, nucleotide
and amino acid mutation levels in VRC01-class
memory BCR VH genes from both vaccine
groups increased significantly from week 8
to week 10 but did not change significantly
from week 10 to week 16 (Fig. 5, A and C;
fig. S20; and tables S46 and S47). Week 10
VH amino acid mutation levels in VRC01-class
IgG memory BCRs reached median values
of 3.0% in both vaccine groups (table S45).
Thus, the week 8 boost caused a relatively
rapid increase in SHM within the VRC01-class
IgG memory pool. Increased mutation
levels also appeared in PBs 4 to 8 days after
the boost (Fig. 5, A and C; fig. S20; and tables
S46 and S47). GCs remained active at week
11, as reflected by the relatively high median
VH amino acid mutation levels of 5.1 and 6.7%
in week 11 GC BCRs in the low- and high-dose
groups, respectively (table S45). The increased
SHM in GCs indicated an ongoing physiological
response to the boost in which B cells continued
acquiring mutations and a fraction of those
cells likely exited to blood. Our finding that an
autologous boost immunization increased mutation
levels in VRC01-class GC and memory B
cells provides support for a key assumption
underlying the germline-targeting vaccine
design strategy, namely that sequential vaccination
can increase the maturation of targeted
B cell classes in humans.

Comparing GC BCR mutation levels at
weeks 3 and 11 provides insight into mechanism.
In GC BCRs, the distributions of VH
and Vκ/Vλ mutation levels computed over all
VRC01-class responders were substantially
higher at week 11 than week 3 (Fig. 5, A and C;
fig. S20; and table S45). For example, median
(interquartile range) VH amino acid mutation
levels were 1.0% (1.0 to 1.8%) and 1.3%
(1.0 to 2.0%) at week 3 in the low- and high-dose
groups, respectively, compared with 5.1%
(4.6 to 5.6%) and 6.7% (5.0 to 7.1%) at week 11
(Fig. 5A; fig. S20; and table S45), and the
week 11 violin plot distributions over all BCRs
show little similarity to the lower SHM week 3
violin plots (Fig. 5C). With paired data from
weeks 3 and 11 available for only a few individuals
(N = 1 at low dose and N = 5 at high
dose; fig. S21), we could not ascribe significance
to the higher mutation levels at week 11
(tables S46 and S47). Nevertheless, the above
comparisons between weeks 3 and 11, and the
minimal overlap in mutation distributions for
four individuals with multiple data points at
both weeks 3 and 11 (fig. S21A), suggested that
the second immunization did not cause substantial
priming of naive B cells but instead
primarily induced additional maturation of
VRC01-class GC and/or memory B cells generated
by the first immunization.

SHM in non-VRC01-class IgG memory BCRs increased significantly from
week 4 to week 8 for the high-dose group only and showed no significant
increase from week 8 to week 10 or 16 in either group (Fig. 5, B and D;
fig. S20; and tables S48 and S49). Thus, the germline-targeting vaccine
boost succeeded in increasing memory IgG BCR median mutation levels for
the targeted class of B cells without causing similar increases in
undesired BCRs competing for the same epitope. Non-VRC01-class BCRs
did show significant mutational increases between memory B cells at
week 8 and either PBs at week 9 or GC BCRs at week 11 (Fig. 5, B and
D; fig. S20; and tables S48 and S49). SHM in non-VRC01-class GC BCRs
was significantly higher at week 11 than at week 3 for the high-dose
group and nearly significantly higher for the low-dose group (Fig. 5,
B and D; fig. S21B; and tables S48 and S49), indicating that the second
immunization primarily activated previously mutated GC and/or memory
B cells instead of naive B cells, similar to our observation with
VRC01-class responses.

Fig. 5. Amino acid mutation levels in HC and LC V genes over time,
for VRC01-class and non-VRC01-class BCRs from vaccine recipients.
(A and B) VRC01-class (A) and non-VRC01-class (B) BCR VH percentage of
amino acid (aa) mutations (top) and Vκ/λ percentage of amino acid mutations
(bottom) for the low-dose (left) and high-dose (right) group, with symbols
representing the median per participant per time point. (C and D) VRC01-class
(C) and non-VRC01-class (D) BCR VH percentage of amino acid
mutations (top) and Vκ/λ percentage of amino acid mutations (bottom) for
the low-dose (left) and high-dose (right) group, with violin plots representing
the distribution of all BCRs per group per time point. In (A) and (B), thick
lines are medians and box plots show 25 and 75% quantiles for each
dose group at each time point. Statistical significance is indicated for all
comparisons with pairs of measurements from at least eight participants
(tables S46 and S48). Significance testing was done using Wilcoxon
signed-rank test for paired data (two-sided, α = 0.05); ns, not significant;
*P < 0.05; **P < 0.01; ***P < 0.001. In (C) and (D), solid lines indicate
medians and dashed lines show 25 and 75% quantiles for each dose group
at each time point.


Precursor origins 


We detected prevaccination VRC01-class IgG
memory B cells in only 12.5% (6 of 48) of participants,
and 71% (5 of 7) of those BCRs had
mutation levels substantially above the levels
detected postvaccination at weeks 3 or 4 (tables
S45 and S50), indicating that the VRC01-class
IgG B cells detected postvaccination predominantly
originated from naive VRC01-class
B cells rather than IgG memory B cells. However,
clustering analysis indicated that one
week −4 VRC01-class IgG memory BCR was
potentially clonally related to two postvaccination
BCRs in the same individual (fig. S22),
providing evidence that at least some VRC01-class
IgG BCRs detected postvaccination may
have originated from VRC01-class IgG memory
B cells.

Non-VRC01-class IgG BCRs had significantly higher mutation levels at
week −4 compared with weeks 3 or 4 (Fig. 5, B and D; fig. S20; and
tables S45, S48, and S49), indicating that most non-VRC01-class IgG
B cells detected postvaccination probably also originated from naive
rather than memory B cells. However, sequence clustering analysis
indicated that, among 26 non-VRC01-class memory IgG BCR lineages with
members from both pre- and postvaccination time points, evidence
allowing for a potential preexisting memory B cell precursor could be
found in 11 (42%) of the lineages (fig. S23). Non-VRC01-class BCR
mutation levels at weeks 3 to 8 were substantially higher than those
for VRC01-class (Fig. 5 and table S45), which might have been due to
non-VRC01-class responses deriving in part from already mutated memory
B cells, but also might have been due to low-affinity, naive
precursor-derived, non-VRC01-class BCRs gaining SHM more rapidly than
high-affinity VRC01-class BCRs after the first immunization.

Fig. 6. Properties of postvaccination BCRs shared with VRC01-class
bnAbs. (A) Percentage of BCRs using VRC01-class bnAb Vκ/λ genes, for
VRC01-class and VH1-2-using non-VRC01-class BCRs, and for control VH1-2
BCRs from HIV-unexposed individuals from DeKosky et al. (98). VRC01-class
bnAb Vκ/λ are indicated in the color key. (B) Sequence logos for five-amino acid
LCDR3s from VRC01-class BCRs for bnAbs (top row), low-dose (second row),
high-dose (third row) groups, and human naive precursors from prior studies
(39, 43, 44) (bottom row), distinguishing kappa (left) and lambda (right)
LCs. (C) Sequence logos for five-amino acid LCDR3s from non-VRC01-class
BCRs from the low-dose (second row) and high-dose (third row) groups, and
control data human LCs from HIV-unexposed individuals from the Observed
Antibody Space (OAS) (91, 92) (bottom row), with VRC01-class bnAbs (top row)
shown for reference, distinguishing kappa (left) and lambda (right) LCs.
(D) Percentage of BCRs using Glu or Gln at LC position 96, for five-amino acid
LCDR3s from VRC01-class BCRs, non-VRC01-class BCRs, and OAS control
data LCs (91, 92). (E) Percentage of BCRs with LCDR3 matching a VRC01-class
bnAb sequence, for five-amino acid LCDR3s from VRC01-class BCRs,
non-VRC01-class BCRs, and OAS control data LCs (91, 92), distinguishing IGK
and IGL LCs. (F) Percentage of BCRs with Trp103-5, for VRC01-class and
VH1-2-using non-VRC01-class BCRs and for OAS (91, 92) control data VH1-2 HCs.
(G) Number of key VRC01-class residues in VRC01-class HCs for all time points
in the low-dose (left) and high-dose (middle) groups, with symbols indicating
the 90% quantile per participant per time point, and for VRC01-class bnAbs
(right, different y-axis scale), with symbols denoting bnAbs. 20 μg, low dose;
100 μg, high dose; prime, data from weeks 3, 4, and 8; boost, data from weeks 9,


BnAb characteristics shared by
vaccine-induced VRC01-class IgG BCRs


Vaccine-induced VRC01-class BCRs shared
other characteristic features of VRC01-class
bnAbs in addition to the VH1-2 alleles and
a five-amino acid LCDR3. VRC01-class bnAbs
use a subset of human Vκ or Vλ genes (data S1),
and more than 90% of vaccine-induced VRC01-class
IgG BCRs used known bnAb Vκ or Vλ
genes, in both groups after the prime or boost
(Fig. 6A and fig. S24A). The LCDR3 is an important
site of affinity selection in VRC01-class
bnAbs, with all known bnAbs possessing Glu
or Gln at position 96, and most bnAbs exhibiting
LCDR3 sequence motifs with substantially
reduced diversity at several positions
compared with naive VRC01-class precursors
(Fig. 6B and data S1) (21, 39, 41, 43, 44, 46, 55).
LCDR3s from vaccine-induced VRC01-class
BCRs showed signs of selection toward bnAb
sequences, especially at position 96 (Fig. 6, B
to D, and figs. S24B and S25). For VRC01-class
BCRs with kappa LCs, the median fraction of
LCDR3s perfectly matching a bnAb LCDR3
was 39 or 35% in the low- or high-dose groups,
respectively, whereas lambda chain BCRs
had no bnAb LCDR3s (Fig. 6E and fig. S24, C
and D). Among non-VRC01-class BCRs, a small
fraction (1.8%) had five-amino acid LCDR3s,
many of which had high similarity to VRC01-class
bnAb LCDR3s, whereas control sequences
did not (Fig. 6, C to E). This suggested the
possibility that eOD-GT8 60mer vaccination
might have selected for BCRs using a VRC01-class
binding mode with VH genes other than
VH1-2, which will be investigated further.
Most VRC01-class bnAbs with K3-20 LCs have
deletions in LCDR1 important for accommodating
the N276 glycan conserved on the HIV
spike (42, 55, 56). We observed K3-20 LCDR1
deletions at rates of 5 to 10% in both VRC01-class
and non-VRC01-class BCRs from only a
few participants each (fig. S26), indicating that
such deletions were not common and were
not specifically selected by eOD-GT8, which
was not surprising because eOD-GT8 lacks the
N276 glycan. VRC01-class bnAbs exhibit a wide
range of HCDR3 lengths, from 12 to 18, and
nearly all encode a Trp five residues before
the end of the HCDR3, a position that we and
others have previously inaccurately referred to
as “100,” that we here term “103-5” to count
backward from position 103 at the end of the
HCDR3 (data S1). Trp103-5 was previously found
in ~31% of eOD-GT8-specific human naive
VRC01-class BCRs (21, 43, 44) (fig. S27). HCDR3
lengths for vaccine-induced VRC01-class BCRs
spanned those of VRC01-class bnAbs (fig. S15C),
and the percentage of VRC01-class BCR HCDR3s
with Trp103-5 had median values over all participants
of >47% in both dose groups and after
the prime or boost (Fig. 6F and figs. S24E and
S27), suggesting enrichment of Trp103-5 due to
vaccination. Accounting for combinations of
bnAb features, we found that the median fraction
of VRC01-class BCRs with four of the above
described bnAb features was >25% in both dose
groups and after the prime or boost (fig. S28).
Finally, we considered the acquisition of key
amino acid mutations in the HC, an essential
aspect of vaccination to induce bnAbs. From a
representative set of 19 potent VRC01-class
bnAbs that included all known VRC01-class Vκ
and Vλ genes but also minimized the inclusion
of bnAbs with insertions or deletions, we identified
a set of 20 positions (19 within VH1-2,
plus Trp103-5) at which key VRC01-class residues
are observed, four of which are germline-encoded
in the VH1-2*02 and *04 alleles compatible
with a VRC01-class antibody (data S1).
We counted these key VRC01-class residues on
a scale ranging from −4 to +16 to allow for all
possibilities from losing all germline-encoded
key residues to gaining key residues at all 16
positions not containing a germline-encoded
key residue. The representative bnAbs had a
median of 13 key residues and a range of +8
to +16 (Fig. 6G). We computed 90th percentile
values among VRC01-class BCRs in each
study participant as representative for the
best 10% of BCRs in that individual. The median
of 90th percentile values for key HC
residues was >+2 at nearly all time points at
week 8 or later in both the low- and high-dose
groups (Fig. 6G). Thus, vaccination selected
for the acquisition of important HC residues
in VRC01-class BCRs, suggesting that they
could be guided toward bnAb activity with
further boosting.
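The −4 to +16 scale follows from counting gains and losses relative to germline across the 20 key positions. The sketch below is one way to express that bookkeeping, under stated assumptions: positions are represented as booleans, the first four are arbitrarily taken as the germline-encoded ones, and the percentile uses a nearest-rank convention that may differ from the authors' implementation.

```python
def key_residue_score(has_key_residue, germline_encoded):
    """Key residues gained at non-germline positions minus germline-encoded ones lost."""
    gained = sum(h and not g for h, g in zip(has_key_residue, germline_encoded))
    lost = sum(g and not h for h, g in zip(has_key_residue, germline_encoded))
    return gained - lost


def percentile_90_nearest_rank(values):
    """90th percentile by the nearest-rank method (one of several conventions)."""
    ordered = sorted(values)
    rank = (90 * len(ordered) + 99) // 100  # integer ceiling of 0.9 * n
    return ordered[rank - 1]


# 20 key positions; for illustration the first four are the germline-encoded ones.
GERMLINE = [True] * 4 + [False] * 16

print(key_residue_score(GERMLINE, GERMLINE))      # unmutated germline: 0
print(key_residue_score([True] * 20, GERMLINE))   # all key residues present: 16
print(key_residue_score([False] * 20, GERMLINE))  # all germline key residues lost: -4

# Toy per-BCR scores for one participant at one time point.
print(percentile_90_nearest_rank([0, 0, 1, 1, 1, 2, 2, 3, 3, 5]))  # 3
```

An unmutated VH1-2 sequence scores 0 on this scale, so a 90th percentile above +2 means the top tenth of a participant's BCRs had a net gain of more than two key residues.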


BCR affinity dynamics 


BCR-antigen affinity influences B cell fate
throughout an immune response. To understand
how VRC01-class and non-VRC01-class
BCR affinities for eOD-GT8 evolved over time,
we carried out surface plasmon resonance
(SPR) analyses of the interactions between
eOD-GT8 monomer and recombinant IgGs
corresponding to BCRs from the low-dose vaccine
group, including postvaccination BCRs
and their inferred-germline (iGL) variants
representing naive precursors (Fig. 7, fig. S29, and
data S2).

2 December 2022

[Fig. 6 legend, continued] 10, 11, and 16. In (D) to (G), symbols represent individual participants [except the
bnAb controls in (G)]; thick lines indicate median values, boxes show 25 and
75% quantiles, and whiskers approximate 10 and 90% quantiles.

VRC01-class iGLs had surprisingly high affinities,
with a median KD of 119 nM, ~45-fold
higher affinity than the median KD of 5.3 μM
for eOD-GT8-specific human naive VRC01-class
precursors (21, 43) (Fig. 7). The high
VRC01-class iGL affinities were not likely due
to the presence of affinity-enhancing mutations
at CDR3 junctions, because iGL affinities
were not higher for parental BCRs with
higher SHM levels (fig. S30). Nor were the
high VRC01-class iGL affinities due to bias for
high affinity in the B cell sorting, because substantially
lower-affinity non-VRC01-class BCRs
were recovered (Fig. 7). VRC01-class week −4
memory IgG BCRs, which in most cases were
not likely to have served as precursors (discussed
above), had moderate affinities, with
a median KD of 5.9 μM, in a similar range as
naive VRC01-class BCRs (Fig. 7). Evidently,
the vaccine-induced VRC01-class IgG GC and
memory B cells that survived GC competition
originated predominantly from precursors
with the very highest affinities (subset with
median KD of 119 nM) among the naive VRC01-class
precursors.

By contrast, only 43% of non-VRC01-class
iGLs had detectable binding (KD < 100 μM),
and the median KD was the limit of detection
of our SPR assay (≥100 μM), ~840-fold lower
in affinity than for VRC01-class iGLs (Fig. 7).
Non-VRC01-class week −4 memory IgG BCRs
also had low affinities, with a median KD of
16.0 μM, similar in magnitude to the median
KD of 22.8 μM for non-VRC01-class human
naive precursors (Fig. 7). The capacity for low-affinity
non-VRC01-class precursors to compete
effectively for the CD4bs-specific response
against higher-affinity VRC01-class precursors
was likely due to the higher non-VRC01-class
BCR precursor frequency: Among eOD-GT8
CD4bs-binding naive BCRs isolated using
high-avidity probes to enhance recovery of low-affinity
clones, non-VRC01-class are more common
than VRC01-class by a factor of ~170 (44).
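
The fold differences quoted for the precursors follow from ratios of median dissociation constants once the units are aligned. The sketch below reproduces that arithmetic from the medians stated in the text. The helper name `fold_weaker` is hypothetical, and the non-VRC01-class iGL median is taken at the 100 μM threshold for detectable binding, so that ratio should be read as a lower bound.

```python
# Median KD values quoted in the text, expressed in nM.
NM_PER_UM = 1000.0

vrc01_igl_nm = 119.0                   # VRC01-class iGLs
vrc01_naive_nm = 5.3 * NM_PER_UM       # human naive VRC01-class precursors
non_vrc01_igl_nm = 100.0 * NM_PER_UM   # SPR detection limit, so a lower bound


def fold_weaker(kd_weak_nm: float, kd_strong_nm: float) -> float:
    """Fold difference in affinity; a larger KD means weaker binding."""
    return kd_weak_nm / kd_strong_nm


naive_vs_igl = fold_weaker(vrc01_naive_nm, vrc01_igl_nm)
non_vrc01_vs_vrc01 = fold_weaker(non_vrc01_igl_nm, vrc01_igl_nm)

print(f"VRC01-class iGLs vs. naive precursors: ~{naive_vs_igl:.0f}-fold")
print(f"VRC01-class vs. non-VRC01-class iGLs: at least ~{non_vrc01_vs_vrc01:.0f}-fold")
```

Because KD is inversely related to affinity, the weaker binder goes in the numerator; the two ratios round to the ~45-fold and ~840-fold figures given above.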

After the first vaccination, VRC01-class affinities
increased by an average factor of 4.8 over
iGLs, to median KDs of 12, 31, and 31 nM in
week 3 GC BCRs and week 4 and 8 memory
BCRs, respectively (Fig. 7 and fig. S31). Non-VRC01-class
median affinities increased by a
much larger factor of >833 over iGLs, reaching
a median KD of 120 nM in week 3 GC BCRs,
but then declined 40-fold to median KDs of 3.7
and 6.0 μM in week 4 and 8 memory BCRs,
respectively (Fig. 7 and fig. S31).

Fig. 7. SPR analysis of BCR affinities for eOD-GT8.

After the second vaccination, affinities for
VRC01-class memory BCRs increased by a
factor of 12, from a median KD of 31 nM at
week 8 to median KDs of 3 and 2 nM at weeks 10
and 16, respectively (Fig. 7). Similarly high
affinities were found in PBs at 4 to 8 days after
the boost (median KD of 1 nM). The step-like
jump to higher affinities for VRC01-class BCRs
in the periphery at weeks 9, 10, and 16 tracked
with the step-like increase in mutation levels
(Fig. 5, A and C). VRC01-class GC BCRs at
week 11 had lower median affinity (KD of 26 nM),
but firm conclusions could not be drawn from
that observation because the data were obtained
for only three participants (Fig. 7). Non-VRC01-class
memory affinities increased by
an average factor of 4.4 after the boost, from
a median KD of 6.0 μM at week 8 to median
KDs of 650 and 2100 nM at weeks 10 and 16,
respectively (Fig. 7). Non-VRC01-class PBs
(median KD of 7 nM) and week 11 GC BCRs
(median KD of 0.7 nM) showed substantially
higher affinities than memory BCRs at weeks 8,
10, or 16 (e.g., 3000-fold higher at week 11 than
week 16), demonstrating that strong and rapid
non-VRC01-class affinity maturation occurred
in response to the week 8 boost, but the resulting
high-affinity BCRs either remained
within GCs or populated the plasma compartment
rather than the memory compartment.
Thus, affinity maturation of the memory pool
in response to the boost was more efficient for
VRC01-class than non-VRC01-class B cells.
Overall, VRC01-class precursor BCRs started
with a massive affinity advantage over CD4bs-specific,
non-VRC01-class precursor BCRs (ratio
of median KDs for iGLs ≥840); this advantage
declined but remained high after the
first vaccination (ratio of median KDs at week 8
equal to 194) and then increased again after the
boost (ratio of median KDs at week 16 equal to
1050). VRC01-class affinity gains were associated
with both increased on rates and decreased off
rates, but off-rate reduction was the larger effect
(figs. S32 and S33). Thus, although B cell
selection in the GC might occur under high
avidity conditions (i.e., an array of BCRs on the
surface of a B cell interacting with an array of
eOD-GT8 60mer antigens on a follicular dendritic
cell), the process nevertheless selected
BCRs with improvements in monovalent KD as
well as on and off rate, perhaps by follicular
dendritic cells regulating antigen availability
(57). VRC01-class affinities and on rates both
increased significantly with SHM, and off rates
decreased significantly with SHM, across all
postvaccination BCRs (P values for KD, on rate
(kon), and off rate (koff), respectively, are <0.001,
0.0003, and <0.0001) and in memory B cells
at each time point (figs. S34 to S36), indicating
that SHM contributed to maintaining the
VRC01-class affinity advantage over time.
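
The affinity-advantage ratios reported in this section can be checked directly from the median dissociation constants given in the text. The following is a minimal sketch of that check, not the authors' analysis code: the dictionary layout and the function name `affinity_advantage` are illustrative assumptions, and the non-VRC01-class iGL entry uses the 100 μM detection limit, so the iGL ratio is a lower bound.

```python
# Median KDs in nM as stated in the text. The non-VRC01-class iGL value is the
# SPR detection limit, so the iGL ratio is a lower bound.
median_kd_nm = {
    "iGL": {"vrc01": 119.0, "non_vrc01": 100_000.0},
    "week 8": {"vrc01": 31.0, "non_vrc01": 6_000.0},
    "week 16": {"vrc01": 2.0, "non_vrc01": 2_100.0},
}


def affinity_advantage(kd: dict) -> float:
    """Ratio of median KDs (non-VRC01-class over VRC01-class)."""
    return kd["non_vrc01"] / kd["vrc01"]


advantage = {label: affinity_advantage(kd) for label, kd in median_kd_nm.items()}

for label, ratio in advantage.items():
    print(f"{label}: ~{ratio:.0f}-fold")

# Non-VRC01-class week 11 GC BCRs (0.7 nM) versus week 16 memory BCRs (2100 nM).
gc_vs_memory = 2_100.0 / 0.7
print(f"week 11 GC vs. week 16 memory: ~{gc_vs_memory:.0f}-fold")
```

The ratios round to 840, 194, and 1050 for iGLs, week 8, and week 16, and the GC-versus-memory comparison rounds to 3000, matching the values quoted above.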


Guiding SHM 


A key requirement for germline-targeting 
priming immunogens is the induction of bnAb- 
precursor-derived GC and/or memory B cells 
capable of binding antigens that are more 
similar to the native viral glycoprotein (native- 
like) than the priming immunogen within the 
target epitope (13, 39). This is necessary so 
that a more native-like immunogen can serve 
as a boost to advance B cell maturation further
toward bnAb development. eOD-GT8 was
designed to have an “affinity gradient,” with
stronger binding to bnAbs than to bnAb precursors,
based on the hypothesis that if bnAb
precursors could be primed, the affinity gradient
would guide early SHM toward bnAb
development and concomitantly induce bnAb-precursor-derived
responses that bind to more
native-like antigens (13, 21, 39). As a result,
eOD-GT8 possesses a strong affinity gradient
for VRC01-class BCRs, with ~230-fold higher
affinity for bnAbs than for naive precursors
(21, 43). To determine if eOD-GT8 60mer
immunization selected for VRC01-class BCRs
that bind more native-like immunogens, we
first tested binding of ≥210 VRC01-class memory
BCRs from the low dose group from weeks 4,
8, 10, and 16 to a native-like trimer [BG505 
MD39 (22)] and a core-gp120 lacking the N276 
glycan [core-e-2CC HxB2 N276D (39)]. We 
detected no trimer binding even at the highest 
trimer analyte concentrations tested (11 μM),
even though the trimer analyte allowed for 
avidity in binding, but ~10% of the antibodies 
tested from weeks 10 and 16 showed weak 
binding to core-gp120 (fig. S37). We then tested 
binding of VRC01-class iGL and postvaccination
low-dose-group BCRs to eOD-GT6 (13)
and four variants of eOD-GT6, each of which
had a more native-like CD4bs compared with
that of eOD-GT8 (fig. S38). Previous work demonstrated
that eOD-GT6 had no detectable
affinity for the vast majority of human naive
VRC01-class precursors that bind eOD-GT8,
which was explained by the fact that eOD-GT6
lacks several germline-targeting mutations that
are present in eOD-GT8 (27). Consistent with
that prior finding, we found here that eOD-GT6
and its four variants had very limited reactivity
to VRC01-class iGL precursors. Whereas
eOD-GT8 bound to 97% of iGLs with a median
KD of 119 nM (Fig. 7), eOD-GT6 and its variants
bound to only 2 to 39% of the iGLs and had
median KDs of ≥100 μM (Fig. 8A). However,
postvaccination VRC01-class antibodies showed
improved binding for eOD-GT6 and its variants,
especially after the boost. For example, eOD-GT6
bound to 78% of BCRs tested at week 16,
with an overall median KD of 6 μM (Fig. 8A),
and the four more-native-like variants of eOD- 
GT6 (GT6v2, GT6v3, GT6v4, and GT6-N276+ 
with an intact N276 glycosylation site) bound 
to 75, 29, 24, and 30% of BCRs tested at week 16, 
respectively, in comparison to binding 39, 17, 
2, and 5% of iGLs, respectively (Fig. 8A). Thus, 
vaccination with eOD-GT8 60mer not only induced
VRC01-class responses, but also selected
for mutations in VRC01-class BCRs that conferred
affinity for antigens with more-native-like
CD4bs epitopes. Postvaccination non-VRC01-class
antibodies showed substantially weaker
binding to eOD-GT6 and variants (fig. S39)
compared with VRC01-class antibodies (Fig. 8A),
demonstrating that eOD-GT8 60mer vacci- 
nation also minimized the induction of com- 
peting responses capable of binding more 
native-like CD4bs epitopes. Finally, VRC01-class
affinities for eOD-GT6 and all but one variant
increased with affinities to eOD-GT8 (P values
for eOD-GT6, GT6v2, GT6v3, GT6v4, and
GT6-N276+, respectively, are <0.0001, <0.0001,
0.0002, 0.3809, and 0.008; Fig. 8B). Therefore,
higher affinity for eOD-GT8 translated into
higher affinity for more native-like antigens,
providing support for the hypothesis that engineering
an affinity gradient into a germline-targeting
priming immunogen can help guide
SHM selected by that immunogen (13). Our
data also support the applicability of that hypothesis
to boost immunogens, as previously
proposed (26).

Fig. 8. SPR analysis of VRC01-
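
To make the shift in reactivity toward more-native-like antigens easier to see, the binding percentages quoted in this section for the four eOD-GT6 variants can be tabulated as percentage-point gains from iGLs to week 16. This is only a bookkeeping sketch over the numbers stated in the text; the variable names are illustrative and nothing here is drawn from the underlying SPR data.

```python
# Percentage of VRC01-class BCRs with detectable binding, as listed in the text
# (iGLs versus BCRs tested at week 16), in the order the variants are named.
variants = ["GT6v2", "GT6v3", "GT6v4", "GT6-N276+"]
pct_igl = [39, 17, 2, 5]
pct_week16 = [75, 29, 24, 30]

# Percentage-point gain in the fraction of binders for each variant.
gain = {
    name: after - before
    for name, before, after in zip(variants, pct_igl, pct_week16)
}

for name in variants:
    print(f"{name}: +{gain[name]} percentage points")
```

Every variant gains binders after vaccination, including GT6-N276+, which retains the N276 glycosylation site.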


Discussion 


Learning how to induce broadly neutralizing 
antibodies against pathogens with high anti- 
genic diversity, such as HIV, influenza, hepa- 
titis C virus, or the family of betacoronaviruses, 
represents a grand challenge for rational vac- 
cine design. Germline-targeting vaccine design 
offers one potential strategy to meet this chal- 
lenge. The strategy is predicated on the de- 
sign of priming immunogens that consistently 
induce responses from rare bnAb-precursor 
B cells with predefined BCR features, select 
for at least a modicum of productive BCR 
maturation, and generate a pool of GC and/or
memory B cells likely to be susceptible to boosting
by more native-like immunogens. Here,
in the first test of this strategy in humans,
we found that the eOD-GT8 60mer/AS01B
germline-targeting vaccine prime had an acceptable
safety profile and induced the targeted
VRC01-class IgG B cells with substantial
frequencies in blood and lymph nodes con- 
sistently across vaccine recipients. We further 
demonstrated that the vaccine selected for 
bnAb-like BCR properties and favorable matu- 
ration and consequently generated B cells with 
capacity to bind less-engineered and more- 
native-like forms of the CD4bs epitope. These 
findings establish clinical proof of concept for 
germline-targeting priming, support developing
boost regimens to induce VRC01-class
bnAbs, and encourage extending the strategy 
to other targets in HIV and other pathogens. 

Developing germline-targeting priming 
immunogens for other classes of antibodies 
will be more challenging owing to the HCDR3 
dominance of most antibodies that is not observed
for VRC01-class bnAbs, but a generalized
method for designing such immunogens has
been described (25). Deploying the germline-targeting
strategy effectively for other antibody
classes will require knowledge of population
frequencies of any required gene alleles, analogous
to the VH1-2*02 and *04 alleles here, as
well as frequencies of recombination events
that produce potential bnAb precursors, analogous
to the production of five-amino acid
LCDR3s here.

≥100 μM for eOD-GT6 and variants are shown in gray. S, slope of regression line; P, P value for slope; ns, not significant (P > 0.05).

Our BCR sequence hierarchical clustering 
analysis showed that VRC01-class responses
derived from many independent recombination
events in each participant. Thus, the
ability to prime VRC01-class responses consistently
across vaccine recipients was due
in part to the immunogen having affinity for
diverse VRC01-class precursors. Explicit engineering
of priming immunogens with affinity
for diverse precursors is one of the hallmarks 
of germline-targeting vaccine design that dis- 
tinguishes this approach from others. 

The robust serological and B cell immuno- 
genicity observed here, especially the sub- 
stantial frequencies of eOD-GT8- and CD4bs 
epitope-specific IgG memory B cells in PBMCs 
after only one or two vaccinations, was likely 
due largely to the combination of a high- 
valency, glycosylated nanoparticle immuno- 
gen with a strong adjuvant (13, 39, 47, 58). The 
strong immune responses and acceptable re- 
actogenicity support the use of adjuvanted self- 
assembling nanoparticle vaccines in humans. 

A major challenge for priming and matura- 
tion of responses from rare bnAb-precursor 
B cells is the competition from higher-frequency 
non-bnAb B cells that can engage the same 
epitope (25-27, 39, 45-49). Here, VRC01-class
responses remained in the minority of CD4bs- 
directed responses at all time points after prime 
and boost but nevertheless maintained high 
positivity and exhibited favorable maturation. 
Our BCR sequence and affinity analyses pro- 
vide insights into how that was achieved. The 
affinities required for priming of rare bnAb 
precursors were relatively high (81% were 
better than 3 μM and 50% were better than
119 nM) and orders of magnitude higher than 
for competitors, and the responses were derived 
predominantly from naive B cells. Furthermore, 
the germline-targeting priming immunogen 
was able to (i) select B cells encoding bnAb- 
like properties beyond the properties spe- 
cifically targeted; (ii) stimulate previously 
matured GC and/or memory B cells by a 
booster immunization, leading to increased 
SHM and affinity of bnAb-precursors in the 
memory pool, with weaker increases for com- 
petitors; (iii) maintain a large affinity advan- 
tage for bnAb-precursors over competitors; 
and (iv) guide affinity maturation in bnAb- 
precursors toward bnAb development, likely 
by presenting an affinity gradient. Success- 
ful priming and boosting of bnAb-precursor 
B cells with such a multifaceted set of desir- 
able outcomes, even while the targeted cells 
remained in the minority of epitope-specific 
B cells, suggests that design of immunogens
with appropriate affinities and affinity gradients
can circumvent B cell competition
as a barrier to steered maturation of bnAb- 
precursor responses. These findings provide 
support not only for the concept of germline- 
targeting priming but also for the broader 
concept of sequential vaccination to guide evo- 
lution of targeted responses. 

Our finding that VRC01-class responses derived
mostly from precursors with affinities
better than 3 μM provides a potential benchmark
for other germline-targeting efforts and
accords with preclinical mouse model data
on eOD-GT8 60mer and analogs (47-49), supporting
the use of such models to predict human
responses. Our data on VRC01-class and
non-VRC01-class precursor affinities support
the hypothesis that relatively high affinities
are required for consistent priming of low-frequency
precursors, even for highly multivalent
nanoparticle immunogens (13, 21, 25, 39),
and also support the claim that precursor fre- 
quency and affinity are interdependent for 
determining B cell competitive fitness in GCs 
(47-49, 59). Given that interdependence, it will 
be important to determine whether consistent 
priming by germline-targeting immunogens 
for lower-frequency HCDR3-dominant bnAb- 
precursors (25) will require even higher affin- 
ities than those observed here. 

A germline-targeting vaccine prime should 
generate as large a pool of bnAb-like memory 
B cells as possible to facilitate successful boost- 
ing by a more native-like immunogen. Prior 
human naive B cell sorting revealed that the 
frequency of VRC01-class B cells that bind
eOD-GT8 with affinities better than 3 μM was
~1 in 900,000 among all human naive B cells
(21, 43). Here, we estimated that for the 81%
of VRC01-class responses to eOD-GT8 60mer
derived from precursors with affinities better
than 3 μM, the expansion from naive B cells
to IgG memory B cells was ~37-fold at week 8 
(after one vaccination) and 250-fold at week 
16 (after two vaccinations), averaging responses 
from the low- and high-dose groups. These 
expansion levels provide benchmarks for other 
germline-targeting priming vaccines. The ex- 
pansion after one vaccination observed here 
was substantially stronger than the expansions 
into memory measured in five different mouse 
models with human naive VRC01-class precursors
with affinities better than 3 μM and frequencies
of either 1 in 1 million [two models,
HuGL18 and HuGL17, with 2-fold expansion
and >100-fold contraction, respectively (48)] or
1 in 10,000 [three models, CLK21, CLK19, and
CLK09, with expansions of about 2.5-, 4.0-, and
5.5-fold, respectively (49)]. Hence even precisely
calibrated mouse models can underestimate the 
degree to which human germline-targeting vac- 
cines can induce bnAb precursor-derived mem- 
ory B cell responses, which should be considered 
when evaluating other priming immunogens
in preclinical experiments. A thorough and
detailed comparison of the multidimensional
human data from this trial (frequencies, SHM,
bnAb properties, affinities) with the results of
the many mouse models applied to eOD-GT8
60mer vaccination is warranted and may assist 
the selection and design of models that opti- 
mally predict human responses. 
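
As a rough illustration of what the expansion benchmarks imply, the sketch below converts the reported fold expansions into approximate post-vaccination frequencies. It assumes that fold expansion is simply the ratio of the frequency among IgG memory B cells to the naive precursor frequency of ~1 in 900,000; that definition is an assumption made for this example, not a statement of how the estimate was derived, and the variable names are illustrative.

```python
from fractions import Fraction

# Naive precursor frequency and fold expansions quoted in the text.
naive_frequency = Fraction(1, 900_000)
fold_expansion = {"week 8": 37, "week 16": 250}

# Assumed relation: memory frequency = fold expansion x naive frequency.
implied_frequency = {
    label: naive_frequency * fold for label, fold in fold_expansion.items()
}

for label, freq in implied_frequency.items():
    print(f"{label}: ~1 in {round(1 / freq):,} IgG memory B cells")

# Human week 8 expansion relative to the largest mouse-model expansion (5.5-fold).
human_vs_best_mouse = fold_expansion["week 8"] / 5.5
print(f"human vs. best mouse model: ~{human_vs_best_mouse:.1f}x")
```

Under that assumption, the week 8 and week 16 expansions correspond to roughly 1 in 24,000 and 1 in 3,600 IgG memory B cells, respectively.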

A key hypothesis underlying the germline- 
targeting vaccine design strategy is that 
sequential vaccination with increasingly more-native-like
boost immunogens will be capable
of inducing bnAb development by driving
bnAb-precursor B cells to undergo repeated 
rounds of affinity maturation. This could be 
achieved if each boost stimulated memory 
B cells to return to GCs and/or supplied new 
antigen to existing GCs. Raising questions 
about the practicality of generating sufficient 
mutation by sequential immunization, results 
from mouse studies with model antigens in- 
dicated that stimulation of previously matured 
IgG memory B cells generally induces differ- 
entiation to antibody-secreting cells rather 
than GC reentry for further affinity maturation 
(60-63) and may suffer from clonality bottle- 
necks (60). Furthermore, repeated immuni- 
zation with the same antigen was found to 
increase the number of memory B cells but not 
significantly increase SHM levels or affinities 
(61). Conversely, sequential immunization with 
different immunogens can increase SHM of 
bnAb-precursor memory B cells in knockin 
mouse models (23, 29), and repeated boosting 
with HIV trimers in NHPs or with severe acute 
respiratory syndrome coronavirus 2 (SARS- 
CoV-2) mRNA vaccines in SARS-CoV-2-naive 
humans can increase SHM in antigen-specific 
memory B cell populations (64-66). Our human 
vaccination data showed that for VRC01-class
responses, mutation levels and affinities in IgG 
memory BCRs increased after the boost, and 
polyclonality remained high after prime and 
boost. Thus, our data suggest that maturation 
of human B cells toward bnAb development 
by sequential vaccination remains plausible, 
especially considering that heterologous boost- 
ing should generate more mutation and diver- 
sification than the autologous boost studied 
here. In the future, it will be important to de- 
fine the relative contributions of memory B cell 
reentry to GCs versus antigen refueling of GCs 
to sequential-vaccination-induced B cell mat- 
uration in humans. The degree to which one 
mechanism or the other dominates could have 
implications for optimizing design of boosting 
or shepherding immunogens and regimens and 
for developing strategies to monitor responses 
in clinical trials and preclinical models. 

With efficient priming of VRC01-class responses
established, major challenges remain
ahead for sequential vaccination to shepherd 
these responses to bnAb development. The 
critical step will be to induce B cells that can
bind fully native trimers with the N276 glycan
intact. Succeeding at that step will likely
require first taking one or more smaller steps
to advance the maturation of VRC01-class
bnAb precursor-derived B cells to enable binding
to increasingly more-native-like forms of
the CD4bs (26-29, 56), consistent with our
SPR data showing that VRC01-class BCRs
induced by eOD-GT8 60mer lack detectable
affinity for a native trimer but do show at least
low affinity for core-gp120 or eOD-GT6 variants.
VRC01-class and non-VRC01-class BCRs
induced by eOD-GT8 60mer should be helpful
for identifying candidates for the first heter- 
ologous boost. 

The consistent VRC01-class bnAb precursor
priming demonstrated here represents an un- 
precedented level of vaccine control over the 
specificity of humoral responses and, as such, 
may herald a new era of precision vaccine de- 
sign for HIV and other pathogens. By defining 
desired B cell responses at the molecular level, 
the germline-targeting vaccine design ap- 
proach allows for a highly reductive and itera- 
tive design cycle to optimize vaccine discovery 
and development. 


Materials and methods 
Study design 


IAVI G001, with ClinicalTrials.gov registry
number NCT03547245, was a phase 1, randomized,
double-blind, placebo-controlled
dose escalation study to evaluate the safety
and immunogenicity of eOD-GT8 60mer vaccine
adjuvanted with AS01B, in HIV-uninfected,
healthy adult volunteers. Two doses of 20 μg
or 100 μg eOD-GT8 60mer with AS01B, or two
doses of placebo were given by deltoid intra- 
muscular injection 8 weeks apart, with both 
immunizations given to the same arm. Placebo 
was the buffer used in the vaccine: DPBS containing
10% sucrose at pH 7.5. The CONSORT
diagram is shown in fig. S1. 

The primary objectives of the study were to 
evaluate the vaccine for safety and tolerability 
and the capacity to induce immunoglobulin G
(IgG) B cell responses from rare precursors for
VRC01-class bnAbs. The primary end points
were the occurrence of adverse events, and the
secondary end point was induction of eOD-GT8
60mer-specific, eOD-GT8 monomer-specific
and CD4-binding-site (CD4bs)-specific serum
binding antibody responses. The detection of
VRC01-class responses in IgG memory B cells,
IgG GC B cells, and plasmablasts were exploratory
end points but nevertheless were the critical
immunological readouts to judge vaccine
efficacy. Additional exploratory immunological
analyses included (i) assessing the relative
frequencies of the VRC01-class BCRs and competitors;
(ii) measuring the changes in somatic
hypermutation and eOD-GT8 binding affinities
over time for both types of BCRs; (iii) evaluating
a wide array of properties of the VRC01-class
BCRs, to assess the potential for the BCRs
to mature into bnAbs; and (iv) assessing the
binding of VRC01-class and competitor BCRs
to antigens with CD4bs epitopes closer to native
HIV Env compared to the vaccine.

Leggat et al., Science 378, eadd6502 (2022)


Participants and randomization 


Eligible participants were healthy male and 
female adults aged 18 through 50 years
who were willing to undergo HIV testing,
use an effective method of contraception, 
understood the study in the opinion of the 
investigator or designee, and provided writ- 
ten informed consent. Forty-eight participants 
who met all eligibility criteria were included 
in the study and were randomly assigned to 
receive vaccine or placebo within one of two 
groups. Group 1 included 18 low dose (20 μg)
vaccine recipients and 6 placebo recipients,
and Group 2 included 18 high dose (100 μg)
vaccine recipients and 6 placebo recipients. 
Twenty-four participants were enrolled at each 
of two clinical sites: George Washington Uni- 
versity (GWU) and Fred Hutchinson Cancer 
Center (FHCC). There was no attempt to match 
the participants for any demographic category 
among the three study groups or between the 
two clinical sites. Participant demographics 
are given in table S1. Among enrolled partic- 
ipants, for sex at birth, approximately equal 
numbers were male (25/48; 52.1%) and female 
(23/48, 47.9%); the predominant race reported 
was White (33/48, 68.8%), with Asian and 
Multiracial (each at 5/48, 10.4%) being the next 
highest race categories reported. The ethnicity 
of “Not Hispanic and Not Latino” was reported 
for the majority of the participants (42/48, 
87.5%). The median age and body mass index 
were 27 years and 25.6 kg/m², respectively.
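
The enrollment figures above are internally consistent, which can be confirmed with a few lines of arithmetic. The sketch below recomputes the group allocation and the reported percentages from the stated counts; the dictionary keys are illustrative labels, not names from the trial database.

```python
# Allocation stated in the text: two groups, each with 18 vaccine and 6 placebo recipients.
allocation = {
    "group 1 vaccine (20 ug)": 18,
    "group 1 placebo": 6,
    "group 2 vaccine (100 ug)": 18,
    "group 2 placebo": 6,
}
n_total = sum(allocation.values())

# Demographic counts stated in the text, out of all enrolled participants.
counts = {
    "male": 25,
    "female": 23,
    "White": 33,
    "Asian": 5,
    "Multiracial": 5,
    "Not Hispanic and Not Latino": 42,
}
percent = {label: round(100 * n / n_total, 1) for label, n in counts.items()}

print(f"total enrolled: {n_total}")
for label, value in percent.items():
    print(f"{label}: {counts[label]}/{n_total} = {value}%")
```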


Oversight 


The trial was conducted under an Investiga- 
tional New Drug (IND) application submitted 
to the US Food and Drug Administration, and 
was carried out in compliance with the pro- 
tocol filed within the IND. The trial adhered 
to IAVI standard operating procedures in 
accordance with the guidelines formulated by 
the International Committee on Harmoni- 
zation for Good Clinical Practice in clinical 
studies, and complied with applicable local 
standards and regulatory requirements in- 
cluding review and approval by the institu- 
tional review boards at FHCC and GWU. 
The trial was overseen by a protocol safety 
review team and independent safety moni- 
toring committee. 


Blinding 

Study site investigators, staff and volunteers 
were blinded in terms of vaccine versus pla- 
cebo. An unblinded study pharmacist at each 
site was responsible for vaccine preparation 
and accountability. Staff carrying out immunological
assays were blinded. Staff carrying out
bioinformatic and statistical analyses were unblinded,
which enabled analyses to be carried
out during the trial and led to early planning
and preparation for follow-on trials (IAVI G002,
ClinicalTrials.gov Identifier: NCT05001373; and 
IAVI G003, NCT05414786). 


Safety and tolerability monitoring 


Safety and tolerability were monitored during 
the trial by site investigators, the IAVI medical
monitor and the protocol safety review team. 
The safety and tolerability of the vaccine were 
evaluated by the safety monitoring commit- 
tee for, at minimum, the first 14 days after the 
first vaccination for all participants in the low 
dose group (group 1) before escalating to the 
higher dose level (group 2). Participants were 
followed up to 12 months after the final inves- 
tigational product administration. Adverse 
events (AEs) were grouped by Medical Dic- 
tionary for Regulatory Activities Terminology 
(MedDRA) System Organ Class (SOC) and 
Preferred Term (PT). All AEs were graded for 
the entire duration of the study, using the 
National Institute of Allergy and Infectious
Diseases (NIAID) Division of AIDS (DAIDS) 
Table for Grading the Severity of Adult and 
Pediatric Adverse Events, Version 2.1, July 2017. 


Immunological assays 


Serum antibody binding responses were as- 
sessed by binding antibody multiplex assay 
(BAMA), and serum antibody neutralization 
was assessed using TZM-bl neutralization 
assays. Frequencies of antigen-specific and 
CD4bs epitope-specific B cells were assessed 
by fluorescence-activated cell sorting (FACS). 
The primary immunological readout, the induction
of VRC01-class IgG B cells, was assessed
by CD4bs-specific single B cell sorting, B cell 
receptor (BCR) sequencing, and bioinformatic 
analysis. Polyclonality and genetic diversity of 
VRC01-class IgG BCR responses were assessed
by bioinformatic analysis including hierarchi- 
cal sequence clustering. 


Definition of CD4bs-specific responses 


Assessment of serum or B cell binding to the 
eOD-GT8 CD4bs epitope was determined by 
differential binding to eOD-GT8 and eOD-GT8-KO11,
a variant of eOD-GT8 with three
mutations in the CD4bs (N280R, S365L, and
F371R in HxB2 numbering) that essentially
abrogates binding by VRC01-class precursor
Abs and VRC01-class bnAbs (43, 46-49). In
SPR experiments, we measured no detectable
binding to multiple VRC01-class human naive
precursors and bnAbs at concentrations up to
30 μM of eOD-GT8-KO11 (not shown). eOD-GT8-KO11
was originally referred to as eOD-GT8-KO2
(43, 46, 47) but has subsequently
been referred to as eOD-GT8-KO11 (48, 49).
Additional details on how the eOD-GT8 and
eOD-GT8-KO11 antigens were employed for
BAMA and B cell sorting are provided below. 


Power analysis and rationale for trial size 


Group sizes for the study were selected to mea- 
sure the primary hypothesis that the vaccine 
will induce VRC01-class IgG B cells with a response
rate of at least 50% in at least one of
the adjuvanted protein vaccine arms, as well
as to satisfy the need to have enough end
points for further characterization of that response.
We powered the study to have high
probability of observing at least five participants
with a vaccine-induced VRC01-class IgG
B cell response among participants receiving
eOD-GT8 60mer in study group 1 or 2 given
that the true response rate for this class of
B cells was at least 50% among
eOD-GT8 60mer recipients in either arm. We
assumed a dropout rate of approximately 10%,
which translated into an assumption that immunogenicity
samples would be obtained from
N = 16 recipients of eOD-GT8 60mer in each of
groups 1 and 2. Under those assumptions, we
determined that at least five positive responders
for VRC01-class B cells would be required
for the 95% confidence interval (CI) about the 
observed rate to be consistent with a true rate 
of 50%, because the 95% CI for an observed 
rate of 5/16 = 31.25% is 14.2 to 55.6%. Further- 
more, power was 96.2% to detect five or more 
positive responders out of 16 when the true 
rate of response was 50% (fig. S40). 
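
Both numbers in the power analysis can be reproduced with standard-library Python. The power is the binomial probability of seeing five or more responders among 16 when the true rate is 50%. The text does not name the confidence-interval method; the sketch below assumes a Wilson score interval, which happens to reproduce the quoted 14.2 to 55.6% range for 5/16. The function names are illustrative.

```python
from math import comb, sqrt


def binomial_power(n: int, p: float, k_min: int) -> float:
    """Probability of observing at least k_min successes in n Bernoulli(p) trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple:
    """Wilson score interval for a binomial proportion (assumed method, 95% by default)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


power = binomial_power(n=16, p=0.5, k_min=5)
ci_low, ci_high = wilson_interval(successes=5, n=16)

print(f"power to see >= 5 of 16 responders at a 50% true rate: {100 * power:.1f}%")
print(f"95% CI for 5/16: {100 * ci_low:.1f}% to {100 * ci_high:.1f}%")
```

With five responders the interval still reaches 50%, whereas with four it would not under the same method, which is consistent with five being the stated minimum.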


Vaccine and adjuvant 


eOD-GT8 60mer was manufactured in accordance
with current Good Manufacturing Practice
(cGMP) regulations at Paragon BioServices,
Inc (Baltimore, MD), as described in detail else- 
where (40). In summary, cGMP manufacture 
was accomplished by a combination of puri- 
fication techniques following transient expres- 
sion in suspension-adapted, cGMP-qualified 
VRC293 human embryonic kidney 293 (HEK293) 
cells generated at SAFC (Carlsbad, CA) from a 
Master Cell Bank generously provided by the 
National Institutes of Health (NIH) Vaccine 
Research Center (VRC) within the National 
Institute of Allergy and Infectious Diseases 
(NIAID). VRC293 cells were grown in serum-free
Expi293 medium (Thermo Fisher), transfected
with plasmid DNA encoding eOD-GT8
60mer (cGMP manufactured by Aldevron;
Fargo, ND) and polyethylenimine PEIpro-HQ
transfection reagent (cGMP manufactured by
PolyPlus-Transfection SA; Illkirch, France), and
expression was carried out in the presence of
14 μM kifunensine (cGMP manufactured by
GlycoSyn; Graceville, New Zealand). Benzonase
endonuclease enzyme (high-purity grade,
250 U/μl, from Millipore; Burlington, MA, USA)
was employed to remove residual host cell
and plasmid DNA. The eOD-GT8 60mer cGMP
clinical material was formulated at 1 mg/ml in
10% sucrose in phosphate-buffered saline (PBS)
at pH 7.2, aliquoted at 0.4 ml volume in 2 ml 
type 1 glass vials with stoppers (13 mm stopper, 
Rubber with Flurotec, Afton Scientific), sealed 
with sterile seals from Afton Scientific Cor- 
poration (Charlottesville, VA), and stored at 
-80°C. The material is currently on a stability 
testing program as per regulatory guidelines 
(stable for >36 months). 

Quality control procedures were performed 
on the manufactured eOD-GT8 60mer nano- 
particle to confirm its identity, determine pro- 
tein concentration and purity, establish in vitro 
potency, measure the nanoparticle size, and 
characterize N-linked glycans. Additionally, 
the clinical trial lot was tested to quantify host 
cell residual impurities (host cell proteins and 
host cell DNA), measure bioburden and bacte- 
rial endotoxin, determine subvisible particulate 
matter, and confirm sterility. Quality control 
procedures included sandwich ELISA with 
GL-VRC01 for potency; high pressure liquid
chromatography (HPLC) with tandem mass 
spectrometry (MS/MS) to confirm amino acid 
sequence; N-terminal Edman sequencing for 
further sequence confirmation; size exclu- 
sion chromatography (SEC) to assess purity; 
sedimentation velocity analytical ultracentrifu- 
gation (AUC-SV) for particle size distribution; 
dynamic light scattering (DLS) for average 
particle size; and hydrophilic interaction
liquid chromatography coupled to mass
spectrometry (HILIC-FLD-MS/MS) for N-linked
glycan profiling. eOD-GT8 60mer cGMP
material had full glycan occupancy at only 5 of
10 glycosylation sites and had partial occu- 
pancy at three sites (40). Preclinical material 
that performed well in mouse models had 
full glycan occupancy at three sites and partial 
occupancy at four sites (43). 

AS01B adjuvant is an adjuvant system composed
of two immunoenhancers combined in
a liposomal formulation consisting of dioleoyl 
phosphatidylcholine (DOPC) and cholesterol 
in phosphate-buffered saline solution. The 
immunoenhancers are (i) 3-O-desacyl-4’- 
monophosphoryl lipid A (MPL), a derivative of 
lipopolysaccharide from the Gram-negative bac- 
terium Salmonella minnesota, and (ii) a saponin 
molecule (QS-21) purified from the bark of the 
tree Quillaja saponaria Molina. AS01B was
manufactured and provided by GlaxoSmithKline 
Biologicals (Rixensart, Belgium). QS-21 was li- 
censed by GSK from Antigenics LLC, a wholly 
owned subsidiary of Agenus Inc. (Delaware, 
USA). The administered dose of AS01B
corresponded to 50 µg each of MPL and QS-21.

DPBS with 10% sucrose was manufactured 
by SAFC Biosciences. 


Study procedures 
Vaccine preparation 


Vaccine was diluted to the appropriate dose
and mixed with adjuvant just prior to
administration. For group 1, the low-dose (20 µg)
group, eOD-GT8 60mer investigational product
(IP) was first diluted 1:5 with DPBS sucrose;
then 0.15 ml of this diluted IP was transferred
into a vial containing 0.65 ml AS01B; and
finally, after gentle inversion to mix, 0.6 ml was
withdrawn into a syringe and administered
as an injection in the deltoid muscle of the
nondominant arm. For group 2, the high-dose
(100 µg) group, 0.15 ml of eOD-GT8 60mer IP
was transferred into a vial containing 0.65 ml
AS01B, and after gentle inversion to mix, 0.6 ml
was withdrawn into a syringe and administered
as an injection in the deltoid muscle of the
nondominant arm. For all placebo
assignments, 0.6 ml of DPBS sucrose was withdrawn
into a syringe and administered as an
injection in the deltoid muscle of the nondominant
arm. Vaccine or placebo was administered
at day 0 and week 8 (day 56 ± 7); for each
participant, the first and second injections
were administered in the same arm.


Schedule of procedures 


The full schedule of procedures is given in 
table S2. 


Safety and tolerability 


Weekly follow-up visits were scheduled for 
study weeks 1 to 4 and 9 to 11, and additional 
follow-up visits were scheduled for weeks 16, 
20, 32, and 56. Participants recorded local and 
systemic reactogenicity using a memory aid 
from day 0 through day 7 after vaccination. At 
each vaccination visit, vital signs were mea- 
sured by study staff prior to vaccination and 
at least 30 min postvaccination. Unsolicited
adverse events (elicited through open-ended
questions) were collected from day 0 through
28 days after the second (final) vaccination.
Serious adverse events, medically attended 
adverse events and potential immune-mediated 
diseases (pIMDs) were collected during the 
entire study period through 12 months after 
the second dose administration. Potential 
immune-mediated diseases were a subset of 
adverse events that included autoimmune dis- 
eases and other inflammatory and/or neuro- 
logic disorders of interest which may not have 
had an autoimmune etiology. 


Immunological sample collection 
and storage 


Leukapheresis was performed at two time
points, once during screening at week −4 and
once at study week 10 (approximately 14 days
after the second vaccination). Peripheral blood
mononuclear cells (PBMCs) were collected at
weeks −4, 4, 8, 10, and 16 after the first
vaccination by leukapheresis or by venipuncture
(whole blood with ACD anticoagulant) and
were isolated by density gradient
centrifugation and cryopreserved as aliquots of 20 × 10^6
or 50 × 10^6 cells. Separately, PBMC aliquots
obtained by a single leukapheresis from an 
unvaccinated HIV-negative volunteer served 
as an internal negative control for flow cy- 
tometry panel and probe staining for every 
experiment. Ultrasound-guided fine needle 
aspirations (FNAs) of axillary lymph node(s) 
were performed at two time points, approx- 
imately 21 days after each vaccination. The 
procedure was performed by a board-certified 
radiologist using ultrasound guidance to avoid 
needle insertion into any adjacent structures. 
Plasmablast samples (10 ml PBMCs) were 
collected at week 9 (5 to 8 days after the sec- 
ond vaccination). Lymph node (LN) FNA and 
plasmablast samples were stored on wet ice 
but were not frozen before being subjected 
to cell sorting analyses within 24 hours of 
collection. 


Fine needle aspiration from draining axillary 
lymph nodes 


Aspirates were collected from the draining 
axillary lymph nodes by fine needle
aspiration (FNA) at weeks 3 and 11. Ultrasound was
used to visualize the axillary lymph nodes on 
the same side as the site of the most recent 
vaccination for lymph node sampling. Follow- 
ing site-specific standard FNA procedures, the 
overlying skin was swabbed generously with 
chlorhexidine, betadine, or similar skin disin- 
fectant solution, and 1% lidocaine was injected 
subcutaneously as local anesthetic. A 22-gauge 
lumbar puncture needle attached to a 5 ml 
sterile syringe was then inserted into the largest 
and most accessible lymph node, and negative
pressure was applied by withdrawing the
syringe plunger approximately 2 to 3 ml while
collecting sample over a 30 s period of “to and
fro” needle movement, or until the first
appearance of blood in the syringe. Immediately after
withdrawal of each needle from the study
participant, syringe and needle contents were
expelled into a 50 ml conical tube containing
R10 media. Any remaining cells were flushed out
of the syringe by detaching the lumbar needle
from the syringe, withdrawing approximately
2 to 3 ml of fresh R10 media into the syringe
barrel, reattaching it to the needle, and
expelling the media into the 50 ml conical tube; this
flushing procedure was repeated a total of four
times for each aspiration. The collection was
repeated up to four times for each study
participant, using a new sterile needle and syringe
for each pass. After collection, the sample was
transported to the laboratory on wet ice or cold
packs as soon as possible. Lymph node FNA
samples ranged in recovery from very few cells
up to 4 × 10^7 cells (median recovery varied
between sites from 2.4 × 10^6 to 4.7 × 10^6).


Preparation of lymphocytes from fine needle 
aspirates from axillary lymph nodes 


The 50 ml conical FNA sample was spun at
325 × g for 10 min at 4°C (no brake).
Supernatant was gently aspirated with a serological
pipette down to about 500 µl. If the cell pellet
had any redness, cells were resuspended in
5 ml of cold 1× red blood cell (RBC) lysis buffer
(eBioscience) and incubated for 5 min at
ambient temperature while agitating the sample by
gently pipetting up and down periodically with
a P1000 pipette or using a tube rocker. Cells
were washed with 40 ml of cold 10% FBS/
1× PBS. Samples were centrifuged at 325 × g
for 10 min at 4°C. Supernatant was gently
aspirated with a serological pipette, and the
cell pellets were resuspended in 2 ml of cold
10% FBS/1× PBS. Cell numbers and viability of
the sample were determined using a
hemocytometer, Muse cell analyzer, or Nexcelom
Auto 2000.


Production of protein reagents for 
immunological assays 
Reagents for B cell sorting 


Avi-tagged and biotinylated versions of
eOD-GT8 monomer (27, 39) and eOD-GT8-KO11
monomer (different from the original
eOD-GT8-KO) (43, 46) were produced by a cGLP lab
under contract at Scripps Research. These
proteins were produced as previously described
(73) in FreeStyle 293F (Invitrogen) suspension
cultures by transient transfection using
293Fectin (Invitrogen) of a pHLSec plasmid
containing mammalian codon-optimized eOD with a
C-terminal His6 affinity tag followed by an
Avi tag. Protein was harvested from the
supernatant after 5 days and purified by affinity
chromatography with a HisTrap column
(GE Healthcare) followed by Superdex 75 size
exclusion chromatography (GE Healthcare)
using an ÄKTA chromatography system (GE
Healthcare). Biotinylation was accomplished
using BirA (Avidity), and the level of
biotinylation was estimated by PAGE.


Reagents for BAMA 


Avi-tagged eOD-GT8 monomer was
produced as described above. Non-Avi-tagged
eOD-GT8-KO11 monomer was produced by
similar methods in
the Schief lab at Scripps Research. eOD-GT8
60mer for BAMA was also produced in
the Schief lab, in FreeStyle 293F (Invitrogen)
suspension cultures by transient transfection
using 293Fectin (Invitrogen) of a pHLSec
plasmid containing mammalian codon-optimized
eOD-GT8 60mer. Protein was harvested from
the supernatant after 5 days and purified by
lectin chromatography followed by size
exclusion chromatography with a Superose 6 or
Superose 6 Increase column (GE Healthcare).
Lumazine synthase nanoparticle was
produced by BlueSky Bioservices, in Escherichia
coli, using heat treatment (75°C for 30 min)
of the supernatant from lysed cells followed
by centrifugation to again obtain
supernatant, and then size exclusion
chromatography first with Superdex 75 and finally with
Sephacryl 500.


Serum binding analysis by BAMA 


Serum IgG responses were measured via a
multiplex antigen panel (i.e., eOD-GT8, eOD-GT8-KO11,
eOD-GT8 60mer, and lumazine synthase) to
determine specificity and magnitude. A binding
antibody multiplex assay (BAMA) (67-69) was 
modified by lengthening the primary antibody 
incubation period from 30 to 120 min for en- 
hanced detection sensitivity of early bnAb pre- 
cursors. The assay was validated for accuracy, 
specificity, precision, robustness, and lower limits
of detection and quantitation (LLOD and LLOQ). The
LLOQ for detection of early bnAb precursors
was 0.0041 to 0.0866 µg/ml. Antibodies were
measured at day 0 (day of first vaccination), 
day 14 (2 weeks post-first vaccination), day 28 
(4 weeks post-first vaccination), day 56 (day of
second vaccination), day 70 (2 weeks post-second 
vaccination), and day 112 (8 weeks post-second 
vaccination). Samples were serially diluted (1:50, 
1:250, 1:1250, 1:6250, 1:31250, and 1:156250) 
and incubated for 120 min with a mixture of 
carboxylated fluorescent MagPlex microsphere
sets (Luminex) that were each covalently coupled
to one of the antigens. Antigen-specific IgG
was detected using a biotinylated detection
antibody to human IgG Fc (Southern Biotech),
followed by washing and incubation with 
Streptavidin PE (BD Pharmingen). Samples 
were acquired on a Bio-Plex instrument
(Bio-Rad), and antibody levels were measured as
median fluorescent intensity (MFI) in two
wells and then averaged (mean). The
readout was background-subtracted MFI,
where background
refers to the antigen-specific plate-level control
(i.e., a blank well containing antigen-conjugated
beads run on each plate plus detection
antibody). Additionally, a blank or reference bead
was included to estimate nonspecific antibody 
binding. Area under the titration curve (AUTC) 
was used as the magnitude measure of
interest. AUTC was calculated using the trapezoid
method, with truncation applied to negative
values of MFI* (background-adjusted MFI minus
background-adjusted blank MFI) and to MFI*
values > 22,000. Samples with blank MFI >
5000 were excluded from the analysis. The
positive controls were GL-VRC01 (germline
VRC01 mAb), VRC01 mAb, anti-6×His epitope
tag (for His-tagged proteins), CH31 mAb, and
2G12 mAb.
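
As an illustration only, the AUTC calculation described above can be sketched in Python. The sketch assumes that the curve is integrated over log10 of the dilution factor and that "truncation" means clamping MFI* to the range 0 to 22,000; neither detail is stated in the text. The function names and the example readings are hypothetical.

```python
import math

def truncate_mfi(mfi_star, upper=22000.0):
    """Clamp MFI* to [0, upper] (assumed meaning of 'truncation')."""
    return min(max(mfi_star, 0.0), upper)

def autc(dilution_factors, mfi_star_values, upper=22000.0):
    """Trapezoid-rule area under the titration curve.

    The x axis is log10(dilution factor); this choice is an assumption.
    """
    xs = [math.log10(d) for d in dilution_factors]
    ys = [truncate_mfi(v, upper) for v in mfi_star_values]
    area = 0.0
    for i in range(1, len(xs)):
        area += 0.5 * (ys[i - 1] + ys[i]) * (xs[i] - xs[i - 1])
    return area

# Six-point, fivefold dilution series as used in the assay.
dilutions = [50, 250, 1250, 6250, 31250, 156250]
# Hypothetical MFI* readings, including one above 22,000 and one negative value.
mfi_star = [25000.0, 18000.0, 9000.0, 2500.0, 400.0, -30.0]
print(round(autc(dilutions, mfi_star), 1))
```

Because the dilution steps are evenly spaced on the log scale, the area here reduces to log10(5) times the sum of the averaged adjacent readings.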

The positivity of a response was defined
based on background-adjusted MFI values
at the screening dilution level (1:50) for all
antigens except lumazine synthase. Lumazine synthase was
observed to have high baseline values across
participants at the screening dilution, so
filtering for high baseline MFI* was not applied and
the response call for this antigen was made at
the first dilution for which baseline MFI* was
<6500. Antigen-specific positivity thresholds
were computed for each antigen as the maximum
of 100 and the 95th percentile of the baseline MFI*,
except for lumazine synthase, for which the
threshold was set to
100. Samples from postenrollment visits were
declared to have a positive binding antibody 
response by BAMA if they met three criteria: 

1. MFI* values were greater than the
antigen-specific positivity threshold.

2. MFI* values were greater than three times 
the baseline MFI* values. 

3. Background-adjusted MFI values were 
greater than three times the baseline background- 
adjusted MFI values. 

For differential binding to the CD4 binding
site (CD4bs), defined as the difference in AUTC
for binding to eOD-GT8 (AUTC_GT8) and
eOD-GT8-KO11 (AUTC_KO), the positivity of response
was defined as a positive response to eOD-GT8
plus an additional criterion:

4. ΔAUTC > 0, where ΔAUTC was defined as
AUTC_GT8 − AUTC_KO.
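
The positivity rules above can be summarized in a short Python sketch. It is a minimal illustration, not the analysis code used in the trial: the percentile is computed with linear interpolation between order statistics (the text does not say which percentile definition was used), and all function and argument names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation (an assumed definition)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def positivity_threshold(baseline_mfi_star, floor=100.0):
    """Maximum of 100 and the 95th percentile of baseline MFI* for one antigen."""
    return max(floor, percentile(baseline_mfi_star, 95))

def is_positive(mfi_star, baseline_mfi_star, adj_mfi, baseline_adj_mfi, threshold):
    """Apply criteria 1 to 3 to one postenrollment sample."""
    return (mfi_star > threshold                 # criterion 1
            and mfi_star > 3 * baseline_mfi_star  # criterion 2
            and adj_mfi > 3 * baseline_adj_mfi)   # criterion 3

def is_cd4bs_positive(gt8_positive, autc_gt8, autc_ko):
    """Criterion 4: positive response to eOD-GT8 and a positive AUTC difference."""
    return gt8_positive and (autc_gt8 - autc_ko) > 0
```

For example, with hypothetical baseline MFI* values of 10, 20, and 30 for an antigen, the 95th percentile is below 100, so the floor of 100 becomes the threshold.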


Pseudovirus production and neutralization assays 


Neutralizing antibodies against HIV-1 pseudo- 
viruses were measured as a function of reduc- 
tion in Tat-regulated luciferase (Luc) reporter 
gene expression in TZM-bl cells (70, 71).
Neutralization ID50 titers were measured from
specimens obtained at days 0 (visit 3,
baseline), 14 (visit 4, 2 weeks post-first
vaccination), and 70 [visit 8, 2 weeks post-second
(final) vaccination]. Neutralization titer was
defined as the serum dilution at which relative 
luminescence units (RLUs) were reduced by 
50% (ID50) relative to the RLUs in virus con- 
trol wells (cell + virus only) after subtraction of 
background RLUs (wells with cells only). 
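
The ID50 definition above can be illustrated with a brief Python sketch. The interpolation scheme (linear in percent neutralization against log10 dilution, between the two dilutions that bracket 50%) is an assumption, as are the function names and the example dilution series; the text does not describe how the titer was interpolated.

```python
import math

def percent_neutralization(rlu, virus_control_rlu, cell_control_rlu):
    """Percent reduction in RLU relative to virus control, after background subtraction."""
    signal = rlu - cell_control_rlu
    max_signal = virus_control_rlu - cell_control_rlu
    return 100.0 * (1.0 - signal / max_signal)

def id50(dilutions, rlus, virus_control_rlu, cell_control_rlu):
    """Serum dilution factor at which RLU is reduced by 50%.

    `dilutions` must be in ascending order (least dilute first). Returns None
    when the curve does not cross 50% within the tested range.
    """
    neut = [percent_neutralization(r, virus_control_rlu, cell_control_rlu)
            for r in rlus]
    for i in range(len(dilutions) - 1):
        upper, lower = neut[i], neut[i + 1]
        if upper >= 50.0 > lower:
            frac = (upper - 50.0) / (upper - lower)
            log_lo = math.log10(dilutions[i])
            log_hi = math.log10(dilutions[i + 1])
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None

# Hypothetical threefold series starting at 1:20.
dil = [20, 60, 180, 540]
rlus = [10900, 30700, 70300, 90100]
print(id50(dil, rlus, virus_control_rlu=100000, cell_control_rlu=1000))
```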

A specialized panel of 426c Env-pseudotyped
viruses was used to detect and characterize
early intermediates of VRC01-class bnAbs.
Some VRC01 early intermediates neutralize
426c in a manner that requires deletion of one
or more N-glycans in the vicinity of the CD4bs.
Neutralization may be enhanced by Man5-
enrichment of remaining N-glycans that
otherwise are processed into larger complex-type
glycans. Man5-enrichment was achieved by
producing Env-pseudotyped viruses in 293S
GnTI- cells. 293T cells were used to produce
Env-pseudotyped viruses with fully processed
glycans. Mutation D279K was used to confirm
CD4bs specificity.

The viruses 426c.N276D/GnTI- and
426c.N276D.N460D.N463D/GnTI- can detect
germline-reverted forms of VRC01, VRC07,
and VRC20 but not germline-reverted forms
of other CD4bs bnAbs (72).

A specialized panel of CH505 Env-pseudotyped 
viruses was used to detect and characterize early 
intermediates of CH235 and CH103 bnAb line- 
ages. Neutralization by CH235 early inter- 
mediates is dependent on an N279K.G458Y 
double mutation and also requires Man5-
enrichment of N-glycans that are otherwise
processed into larger complex-type glycans. 
Man5-enrichment was achieved by produc- 
ing Env-pseudotyped viruses in 293S GnTI- 
cells. 293T cells were used to produce Env- 
pseudotyped viruses with fully processed 
glycans. Mutation N280D was used to confirm
CD4bs specificity. CH505.N279K.G458Y/GnTI-
detects germline-reverted forms and early
intermediates of CH235 but not germline-
reverted forms of other CD4bs bnAbs (73).

Neutralization by early intermediates of 
CH103 requires deletion of four N-glycans 
in the vicinity of the CD4bs, combined with 
Man5-enrichment of remaining N-glycans that 
otherwise are processed into larger complex- 
type glycans. Mutation S365P was used to con- 
firm CD4bs specificity. CH505TF.gly4/GnTI- 
detects germline-reverted and early interme- 
diates of CH103 (73). 


Statistical analysis of neutralization assay data 


Response to a viral isolate was considered 
to be positive if the neutralization titer was 
greater than or equal to 20. Response rates 
were calculated with 95% Wilson score
intervals. Response magnitude analyses included
vaccine recipients only and included
responders and nonresponders (with truncated titer
measurements). There were no detectable
neutralizing antibody responses to the eOD-GT8
vaccine (low-dose or high-dose) that indicated
elicitation of early intermediates of VRC01-class,
CH235, or CH103 bnAbs. There was one
low-level response in the low-dose group (ID50 = 22.47)
to CH505TF.gly4.S365P.2/293S/GnTI- (CH103
precursor knockout mutant) after the second
vaccination, but this was hard to interpret in
the absence of a response to CH505TF.gly4
(CH103 precursor). Negative results do not
necessarily indicate a lack of bnAb precursors,
since the virus panel only detects a subset of
CD4bs bnAb precursors. There were no tier 1A
(MW965.26/293T/17 and MN.3/293T/17),
placebo, or baseline responses.

Due to overall lack of neutralization response, 
response rate and magnitude testing were not 
performed. 
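
For readers unfamiliar with the Wilson score interval mentioned above, a minimal Python sketch of the standard formula follows. The function name and the example counts are hypothetical and are not taken from the trial data.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Two-sided Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical example: 0 responders out of 18 participants.
low, high = wilson_interval(0, 18)
print(f"{low:.3f} to {high:.3f}")
```

Unlike the simple normal-approximation interval, the Wilson interval remains informative when the observed response rate is 0 or 1, which is the situation for an assay with no detectable responses.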


Overview of the VRC01-class B cell assay


VRC01-class B cell precursors were identified
first based on their ability to bind to the CD4bs
of the eOD-GT8 protein, determined using
fluorescently labeled eOD-GT8 proteins and a
modified eOD-GT8 protein containing mutations
specifically in the CD4bs epitope (eOD-GT8
KO11). Differential staining of B cells by the
eOD-GT8 protein but not by the CD4bs mutant
(eOD-GT8 KO11) indicated that B cells were
specific to the CD4bs. Therefore, fluorescently
labeled eOD-GT8 and eOD-KO11 were
combined with a flow cytometry antibody panel to
identify and single-cell sort the specific B cell
populations of interest: circulating memory
B cells, plasmablasts, and germinal center (GC)
B cells. CD4bs-specific B cells bound two
eOD-GT8 tetramers (GT8++) but not tetramers of
eOD-GT8 KO11 (KO−). Since the B cell sorting
and sequencing needed to be performed on
fresh samples for plasmablasts and lymph
node (LN) FNAs, the sorting and sequencing
were established and performed at two
laboratories that were in close proximity to the
clinical sites. Procedures and reagents were
harmonized (except where noted otherwise)
and verified at the two laboratories (figs. S5
to S10 and tables S10 to S19).

PBMC samples from weeks −4, 4, 8, 10, and
16 after the first vaccination were sorted for
CD4bs-specific IgG memory B cells; draining
axillary lymph node cells, acquired by FNA at
weeks 3 and 11, were sorted for CD4bs-specific
IgG germinal center (GC) B cells; and PBMC
samples at week 9 (5 to 8 days after the second
vaccination) were sorted for CD4bs-specific
IgD− plasmablasts (PBs) (Fig. 1A and tables S10
and S11). For each sample, RT-PCR and DNA 
sequencing were applied to as many CD4bs- 
specific IgG cells as possible, up to a maximum 
of two 96-well plates. DNA for HCs and LCs 
(kappa and lambda) was sequenced by the 
Sanger method, and selected samples from the 
low dose vaccine group were resequenced using 
next-generation sequencing. BCR sequences 
were subjected to quality filtering and bioin- 
formatic analysis. Additional details of these 
procedures are described below. 


Tetramer probe preparation 


Tetramers were prepared at a 4:1 molar ratio 
of monomeric eOD-GT8 or KO11 proteins to
streptavidin. The total volume of protein, 1x 
PBS, and protease inhibitor (catalog no. 539131- 
10VL; MilliporeSigma; Burlington, MA, USA) 
was added to a microcentrifuge tube. Twenty 
percent of the total streptavidin volume was 
added and incubated with continuous rota- 
tion for 20 min at 4°C in the dark. The incre- 
mental addition of streptavidin was repeated 
four times until the total amount of strepta- 
vidin had been added to the protein. Tetra- 
mers were made fresh for each experiment 
(VRC) or stored at 4°C and kept up to 1 month 
(FHCC). 
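
The stoichiometry of the stepwise streptavidin addition can be worked through with a small Python sketch. It assumes the amounts are tracked in picomoles and that the streptavidin stock concentration is known; the function name, units, and example quantities are hypothetical.

```python
def streptavidin_additions(monomer_pmol, sa_conc_pmol_per_ul, ratio=4.0, steps=5):
    """Plan the stepwise streptavidin addition for tetramer assembly.

    ratio: molar ratio of monomer to streptavidin (4:1 in the text).
    steps: one initial addition of 20% plus four repeats gives five additions.
    Returns the total streptavidin volume (ul) and the volume per addition.
    """
    sa_pmol = monomer_pmol / ratio
    total_ul = sa_pmol / sa_conc_pmol_per_ul
    return total_ul, [total_ul / steps] * steps

# Hypothetical example: 400 pmol of biotinylated monomer, streptavidin at 10 pmol/ul.
total, additions = streptavidin_additions(400.0, 10.0)
print(total, additions)
```

With a 20 min incubation after each of the five additions, the assembly takes roughly 100 min in total.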


Monoclonal bead controls 


Bead assays were performed on the day of 
sorting experiments to confirm the function- 
ality of the tetramers by flow cytometry. Anti- 
mouse Ig kappa beads (BD Bioscience, La Jolla, 
CA, USA) were washed with 3 ml of R10 in 
polystyrene FACS tubes and then centrifuged 
at 650 × g for 5 min. Supernatants were
discarded and beads were resuspended in 100 µl of
R10. Each bead control was given 1 µg of mouse
anti-human IgG and incubated for 15 min
at room temperature. Beads were washed with
3 ml of R10 and resuspended with either 1 µg
of glVRC01 mAb (to test eOD-GT8 probes) or
KG064 mAb (to test eOD-GT8-KO11 probes) in
100 µl. Beads were incubated for 15 min at
room temperature, washed with 3 ml of R10,
and then resuspended in 100 µl of R10. Both
eOD-GT8 probes were added to the glVRC01
positive control beads as well as negative
control beads that did not receive glVRC01. The
KO11 probe was added to the KG064 mAb-
labeled beads as well as a bead-only control
that did not receive KG064. Each tube was
incubated for a minimum of 25 min at 4°C.
Beads were washed with 3 ml of R10,
centrifuged, decanted, and resuspended in up to
300 µl R10 for collection.


Enrichment of B cells from cryopreserved 
PBMC samples 


All five cryopreserved PBMC samples from a 
single participant were sorted in an experi- 
mental batch on a given day. Cryopreserved 
PBMCs were thawed, either by water bath or 
with use of Thawsome cryovial adapters, into 
warm benzonase (MilliporeSigma; Burlington, 
MA, USA) supplemented R10 [RPMI 1640 
with 25 mM HEPES buffer and L-glutamine 
(Thermo Fisher Scientific; Waltham, MA, USA) 
and 10% fetal bovine serum (FBS; Nucleus 
Biologics, San Diego, CA, USA)]. Samples were 
centrifuged at 300 x g for 12 min. The super- 
natant was decanted, and the cells were washed 
with R10. The cell recovery and viability were 
optionally determined using the Muse Cell 
Analyzer (MilliporeSigma; Burlington, MA, 
USA). B cells were enriched using the EasySep 
Human Pan-B Cell Enrichment Kit (StemCell 
Technologies; Vancouver, CA) and the Big Easy 
magnet (StemCell Technologies; Vancouver, 
CA) following the manufacturer’s instructions 
summarized here. Samples were resuspended 
at a concentration of 5 × 10^7 cells/ml with
EasySep buffer. Fifty microliters of Enrich- 
ment Cocktail (per ml of sample) were added 
and mixed. The samples were incubated at 
room temperature for 10 min. Magnetic par- 
ticles were vortexed for 30 s, and 75 ul of mag- 
netic particles (per ml of sample) were added 
and mixed with the samples and incubated at 
room temperature for 5 min. EasySep Buffer 
was added to the sample up to 5 ml for samples
under 2 ml (<10^8 cells) or 10 ml for samples
greater than or equal to 2 ml (≥10^8 cells). The
sample was loaded into the EasySep magnet 
without the lid and incubated at room temper- 
ature for 5 min. With the tube on the magnet, 
the cell suspension was pipetted off and into a 
new conical. The sample conical was removed 
from the magnet, and the beads were resus- 
pended with the same amount of EasySep 
Buffer used previously and then mixed and 
reincubated on the EasySep magnet (without 
lid) at room temperature for 5 min. With the 
tube on the magnet, the cell suspension was 
pipetted off and combined with the previous
matching sample. The cell number and viability
of the sample were optionally determined.
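
The reagent volumes in the enrichment procedure scale with sample volume, which a short Python sketch makes explicit. The sketch assumes a resuspension density of 5 × 10^7 cells/ml (so that 2 ml corresponds to 10^8 cells); the function name and dictionary keys are hypothetical, and the manufacturer's instructions remain the authoritative source.

```python
def easysep_volumes(cell_count, cells_per_ml=5e7):
    """Reagent volumes for B cell enrichment as summarized in the text.

    cells_per_ml is the assumed resuspension density.
    """
    sample_ml = cell_count / cells_per_ml
    return {
        "sample_ml": sample_ml,
        "cocktail_ul": 50.0 * sample_ml,    # 50 ul Enrichment Cocktail per ml
        "particles_ul": 75.0 * sample_ml,   # 75 ul magnetic particles per ml
        # Top up to 5 ml for samples under 2 ml, otherwise to 10 ml.
        "top_up_to_ml": 5.0 if sample_ml < 2.0 else 10.0,
    }

# Hypothetical example: 1 x 10^8 thawed PBMCs.
print(easysep_volumes(1e8))
```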


Flow cytometry staining procedures 


Cells, either fresh PBMCs from the plasma- 
blast time point (week 9), fresh lymph node 
mononuclear cells (from FNAs at weeks 3 
and 11), or enriched B cells from previously 
cryopreserved samples (at weeks −4, 4, 8,
10, and 16), were resuspended in 100 µl 10%
FBS/1× PBS and stained with the fluorescently
labeled eOD-GT8 KO11 tetramer for 30 min at
4°C. Cells were washed with 10% FBS/1× PBS
and then resuspended in a staining cocktail of
antibodies and remaining eOD-GT8 tetramers
diluted in Brilliant Buffer (BD Bioscience, La 
Jolla, CA, USA). The antibody staining panels 
were tissue dependent and included a panel 
for PBMC samples (memory B cells and plasma- 
blasts) (table S10) and a separate panel for 
lymph node FNAs (table S11). After surface 
staining, the samples were resuspended at 
about 2 × 10^6 cells/ml in R10 containing the
viability dye (7AAD) and maintained at 4°C 
until processed by flow cytometry. 


Standardized templates for flow 
cytometry analysis 


Standardized templates were designed using 
BD Diva software for analysis of bead assays, 
internal controls, and experimental samples 
for each sample type (fig. S7). These templates 
were replicated for each flow cytometry ex- 
periment to ensure consistent gating and label- 
ing of samples. Templates were tested and 
compared between both test sites using ali- 
quots of the internal control sample before 
trial samples were used. Because antigen-specific
events were rare in the internal control sample,
which made setting gates difficult, we
developed a standardized gating system, referred
to as the M.A.R.I.O. gate, that could
be implemented at both sites.


Mathematically articulated and reasoned but 
independently optimized (M.A.R.I.O.) gating


The analysis of fresh samples for the clinical 
trial required the use of equipment that was 
located near the clinical trial sites. The phys- 
ical differences between flow cytometers, and 
the subjectivity of gating rare and hard to 
identify cell populations, can lead to high levels 
of variability in the sorting of populations like 
antigen-specific B cells. For this trial, we de- 
veloped a method of standardizing the gating 
of antigen-specific B cells by flow cytometry. 
eOD-GT8 proteins were conjugated to two
fluorophores, allowing us to identify
fluorescently double-positive events and reduce the
fluorophore-specific background that would
be selected by using either fluorophore
individually. Due to the variability between flow
cytometers, developing a method to standardize
the gating of antigen-specific events by
flow cytometry was important for this trial.
Here we utilized FMP (fluorescence minus 
probe) stained samples (fully stained samples 
without the addition of fluorescent probes) to 
determine the background for each channel 
being used to detect antigen-specific B cells. 
We then developed a gate based on the ratio 
of these fluorophores for each flow cytome- 
ter which could identify fluorescently well- 
balanced antigen-specific B cells while also 
reducing the amount of antigen-non-specific 
B cells captured by the gate. 

FMP samples were used to calculate the
coordinates for the 99th percentile of each
probe-specific fluorophore using FlowJo
software. This served as the upper threshold of the
probe-negative gate (fig. S7A; C1). Similarly,
twice the 99th percentile of each probe-specific
fluorophore was determined to serve as the
lowest acceptable threshold for probe
double-positive events (fig. S7A; C2). Axis
points for each individual fluorophore were
placed at coordinates representing 10 times
the 99th percentile for each individual
fluorophore (fig. S7A; C4, C5). Additional axis points
were added based on the
proportion of the 99th percentiles for each
fluorophore (fig. S7A; C6, C7). These bounds were
used to define the area in which
fluorescent probe double-positive events were
acceptable for sorting by each test facility and
to account for each facility’s individual cytometer
characteristics (fig. S7B).
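
A partial sketch of how the percentile-based coordinates could be derived is given below in Python. It covers only the 99th percentile (C1), its doubling (C2), and the tenfold axis values (C4, C5); the proportion-based points (C6, C7) and the final gate polygon are not reproduced because the text does not define them fully. The percentile definition (linear interpolation), the function names, and the simple lower-bound check are all assumptions.

```python
def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation (an assumed definition)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def mario_thresholds(fmp_channel_a, fmp_channel_b):
    """Derive gate coordinates from FMP-stained events in the two probe channels."""
    p99_a = percentile(fmp_channel_a, 99)
    p99_b = percentile(fmp_channel_b, 99)
    return {
        "negative_upper": (p99_a, p99_b),                 # C1
        "double_positive_lower": (2 * p99_a, 2 * p99_b),  # C2
        "axis_a": 10 * p99_a,                             # C4 (assumed channel order)
        "axis_b": 10 * p99_b,                             # C5 (assumed channel order)
    }

def in_double_positive_region(event_a, event_b, thresholds):
    """Check only the lower corner of the gate, not the full gate shape."""
    low_a, low_b = thresholds["double_positive_lower"]
    return event_a >= low_a and event_b >= low_b
```

Because the thresholds are computed from each cytometer's own FMP data, the same rule yields instrument-specific coordinates, which is the point of the standardization described above.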


FACS 


Individual target cells were index-sorted into 
empty skirted 96-well plates (Thermo Fisher 
Scientific; Waltham, MA, USA). Several wells 
were left empty for subsequent PCR controls. 
PBMC samples from weeks −4, 4, 8, 10, and
16 after the first vaccination were sorted until
either four 96-well plates were filled with
eOD-GT8 CD4bs-specific IgG memory B cells
(CD19+ CD20+ IgD− IgG+ KO− GT8++; fig. S8)
or the sample was exhausted; two 96-well
plates were subjected to RT-PCR and BCR
sequencing, while the other two plates were
reserved at −80°C. The number of PBMCs
thawed and stained depended on the time
point: 200 × 10^6 (FHCC) or 250 × 10^6 (VRC)
at week −4, 100 × 10^6 at week 4, 100 × 10^6 at
week 8, 200 × 10^6 at week 10, and 100 × 10^6 at
week 16.

In planning the trial procedures, we expected 
that GC B cells from LN FNAs and plasmablasts 
(PBs) would have lower levels of surface IgG, 
and hence we were uncertain if sorting of 
these samples would have sufficient detectable 
antigen-specific staining by flow cytometry. 
We therefore decided to sort two plates of 
phenotype-specific cells without regard to 
tetramer binding, followed by two plates of 
phenotype-specific cells that were also CD4bs- 
specific as determined by differential tetramer
staining. Therefore, up to two plates of total
GC B cells (CD19+ IgD− IgG+ CD20" CD38")
were sorted, as well as CD4bs-specific GC B cells
(CD19+ IgD− IgG+ CD20" CD38" KO− GT8++)
(fig. S9A). From fresh PBMCs at week 9, two
plates of PBs (CD19+ CD27+ IgD− CD38") were
sorted without regard to tetramer binding
followed by the sorting of CD4bs-specific PBs
(CD19+ CD27" CD38" IgD− KO− GT8++) (fig.
S9B). Note that the PB gating strategy, which
included CD38" and did not gate on CD20,
may have included CD38" CD20" B cells
that have been described as activated B cells
as opposed to PBs (10). A maximum of four
96-well plates of B cells were sorted for cDNA 
synthesis per time point. Plates were sealed, 
centrifuged at 800 x g for 1 min and imme- 
diately transferred to dry ice. Plates were stored 
at -80°C for at least overnight before proceed- 
ing. A total of 884 (384 at VRC; 500 at FHCC) 
96-well plates of sorted cells were produced 
and stored. 


Standardized and direct analysis of sorting flow 
cytometry data 


To allow for direct analysis of the flow cy- 
tometry data as it was analyzed during the 
sorts, the flow cytometry experiments were 
exported from Diva software as XML files 
along with the FCS files for the total recorded 
events and the index files for the sorted cells. 
These primary data were processed and con- 
catenated at the Vaccine Immunology Statis- 
tical Center (VISC), using the Cytoverse suite of 
R packages. CytoML (74) is an R/Bioconductor 
package that enables cross-platform import, 
export, and sharing of gated cytometry data. 
Using CytoML we parsed the XML files created 
during the Diva experiments, which contained 
data transformations, compensation matrices, 
gates and their hierarchical relationships, sam- 
ple meta-data, and other information required 
to reproduce the gating analysis. Once the 
workspaces were imported into R, the analysis 
was faithfully reproduced for each sample, the 
gated cytometry data was visualized, and pos- 
itive proportions for all populations were 
calculated. As a quality control measure, the 
calculated percent positive populations were 
compared to exported tables of populations 
from the Diva software using Spearman cor- 
relations. This data was processed using R 
(v4.1.1) and Cytoverse packages CytoML (v2.5.4), 
flowWorkspace (75) (v4.5.3), ggcyto (76) (v1.20.0), 
and cytolib (v2.5.3). Frequencies were measured 
either during the period of sorting (FHCC) or 
until the samples were exhausted (VRC). 
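
The quality control comparison described above can be illustrated with a minimal sketch. The actual analysis was run in R with the Cytoverse packages; the Python below is only an illustration of a Spearman correlation between recomputed and exported percent-positive values, and the function names and example values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical percent-positive values for four populations.
recomputed = [0.10, 0.52, 1.30, 2.75]   # recalculated from the imported workspace
exported = [0.11, 0.50, 1.28, 2.80]     # table exported from the sorter software
rho = spearman(recomputed, exported)
```

A coefficient close to 1 indicates that the reproduced gating orders the populations the same way as the original export. The sketch does not handle constant inputs, for which the correlation is undefined.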


PCR and sequencing positive and 
negative controls 


Early in the trial, the lysates from an immortalized 
VRC01-class naive B cell line, referred 
to as Immo-A1 (77), and sorted pooled donor 
CD19+ B cells were used as PCR controls. 
These were found to be insufficient as controls, 
so we developed synthetic positive controls 
to use for the remainder of the trial. 
Synthetic positive control DNA sequences com- 
patible with our nested PCR protocol were 
generated for B cell receptor heavy and both 
light chains (kappa and lambda). Synthetic 
sequences were based on natural sequences 
identified from VRC01-class antibody sequences 
isolated from human PBMCs. The CDR3 amino 
acid junctions were altered to distinguish 
each of the synthetic controls from naturally 
occurring sequences and from each other by 
replacing them with amino acid “VRC,” “VIP,” 
and “DL” segments separated by chain-specific 
amino acids. Heavy chain CDR3 sequence tags 
coded for histidine (H) leading and separating 
the segments. Similarly, kappa chain CDR3 
sequence tags coded for lysine (K) and lambda 
chain CDR3 sequences coded for leucine (L) 
leading and separating the segments. The se- 
quence tags are shown below, and the full nu- 
cleotide sequences for the synthetic controls 
are given in table S19. 

Sequence tags: 

1) 5' GVRCGVIPGDL 3' 5' ggcgtgcgctgcggcgtgattccgggcgatctg 3' 

2) 5' KVRCKVIPKDL 3' 5' aaagtgcgctgcaaagtgattccgaaagatctg 3' 

3) 5' LVRCLVIPLDL 3' 5' ctggtgcgctgcctggtgattccgctggatctg 3' 
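
As a consistency check, each nucleotide tag can be translated with the standard genetic code and compared with its amino acid tag. The following is a minimal sketch; the nucleotide strings are the three tags written without line breaks, and the function and variable names are illustrative only.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides):
    """Translate an in-frame nucleotide string into one-letter amino acids."""
    nucleotides = nucleotides.lower()
    if len(nucleotides) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(
        CODON_TABLE[nucleotides[i:i + 3]] for i in range(0, len(nucleotides), 3)
    )


# Amino acid tag -> nucleotide tag, as listed in the sequence tags.
SEQUENCE_TAGS = {
    "GVRCGVIPGDL": "ggcgtgcgctgcggcgtgattccgggcgatctg",
    "KVRCKVIPKDL": "aaagtgcgctgcaaagtgattccgaaagatctg",
    "LVRCLVIPLDL": "ctggtgcgctgcctggtgattccgctggatctg",
}

mismatches = [aa for aa, nt in SEQUENCE_TAGS.items() if translate(nt) != aa]
```

An empty `mismatches` list means every nucleotide tag encodes its stated peptide.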

To further differentiate the synthetic con- 
trols by sequence as well as by physical size 
using gel electrophoresis for quality control 
purposes, novel sequences were inserted at 
both ends of the sequences. Inner and outer 
primer binding sites were attached to ensure 
these sequences would amplify with our nested 
PCR primer sets. Ultimately control sequences 
were ~50% mutated from natural sequences 
distributed throughout the sequence to ensure 
identification even with high mutation and 
poor sequence quality. 

Synthetic controls were tested for primer 
binding capacity, in frame amino acid transla- 
tion, Ig chain homology and gene usage, and 
potential gBlock production issues. All con- 
trols were titered and rigorously tested prior 
to use in the trial. 


Single-cell sequencing 


In preparation for cDNA synthesis, plates were 
removed from -80°C storage and again cen- 
trifuged at 800 x g for 1 min before being 
placed on wet ice. The cDNA synthesis reac- 
tion mix was constructed using the SuperScript 
III Reverse Transcriptase Kit (Invitrogen). The 
reaction mix consisted of 1x SuperScript III 
First Strand Buffer, 4.8 mM DTT and 200 U 
SuperScript III, as well as 20 U RNaseOUT 
(Invitrogen), 450 ng Random Hexamers (Gene 
Link) and 0.77 mM dNTP mix (GeneAmp). The 
mix was added to every well, including wells 
12E through 12H to be used as negative controls. 
Thermal cycling parameters were 42°C 
for 10 min, 25°C for 10 min, 50°C for 60 min, 
and 94°C for 5 min. Once complete, plates 
were frozen at -20°C. A total of 531 (267 at 
VRC, 264 at FHCC) unique cDNA plates were 
generated for subsequent PCR amplification. 
Amplification proceeded by three immuno- 
globulin chain-specific PCRs. The reaction mix 
generally consisted of 2 U HotStarTaq Plus 
DNA polymerase and 1x HotStar Plus PCR 
Buffer (Qiagen; Hilden, Germany), as well as 
0.25 mM dNTP mix and 1.5 mM MgCl2. Each 
chain was amplified using a chain-specific pool 
of forward primers (IDT) and isotype-specific 
reverse primers (IDT) (tables S13, S15, and S17). 
One reaction mix each was constructed containing 
2.1 uM forward and 0.2 uM IgH reverse 
primers, 0.3 uM forward and 0.625 uM reverse 
Igκ primers, and 0.8 uM forward and 0.625 uM 
reverse Igλ primers. The template input was 
4 ul of cDNA, including 4 ul carried forward 
from each negative control in wells 12E through 
12H. The cDNA plates were briefly vortexed 
and centrifuged before template addition. Thermal 
cycling parameters were 95°C for 5 min 
and 50 cycles of 95°C for 30 s, either 52°C (IgH) 
or 58°C (Igκ and Igλ) for 30 s and 72°C for 55 s, 
followed by a final extension of 72°C for 10 min. 
Once complete, plates were frozen at -20°C. 
An additional nested PCR was performed 
for all chains to increase specificity. The reac- 
tion mix consisted of 1 U Phusion High-Fidelity 
DNA Polymerase and 1x Phusion High-Fidelity 
Reaction Buffer (New England Biolabs; Ipswich, 
MA, USA), as well as 0.2 mM dNTP mix and 
1x Q Solution (Qiagen; Hilden, Germany). Each 
chain was again amplified using a chain-specific 
pool of forward primers and isotype-specific 
reverse primers, with one reaction mix each 
containing 5.5 uM forward and 1.0 nM IgH 
reverse primers, 4.0 uM forward and 5.0 uM 
reverse Igκ primers, and 3.5 uM forward and 
5.0 uM reverse Igλ primers (tables S14, S16, 
and S18). The template input was 4 ul of chain-specific 
PCR product, again carrying 
forward 4 ul of each negative control from 
wells 12E through 12H. In addition, chain- 
specific positive PCR controls (synthesized 
by IDT) were included in wells 12A through 
12D. These synthetic constructs were added 
at 400 copies per well. The PCR plates, in- 
cluding the positive PCR controls, were briefly 
vortexed and centrifuged before template addi- 
tion. Thermal cycling parameters were 98°C for 
30 s and 35 cycles of 98°C for 30 s, either 58°C 
(IgH and Ig) or 52°C (Ig) for 30 s and 72°C 
for 55 s, followed by a final extension of 72°C for 
10 min. Once complete, the plates were frozen 
at -20°C and shipped to Genewiz (South 
Plainfield, NJ) on dry ice. Genewiz conducted 
an enzymatic clean-up on the PCR products 
before conducting unidirectional Sanger sequencing 
using the reverse chain-specific constant 
region primers (3′Cγ CH1, 3′Cκ 494, and 
3′Cλ shown in tables S14, S16, and S18). Sequencing 
data were obtained for a total of 
3186 PCR plates (six times the total number 
of cDNA plates), not accounting for repeats. 

Quality control measures were taken for all 
sequence data to determine if sequencing of 
specific plates needed to be repeated. When 
PCR controls were not available, we required a 
minimum of 50% of the sequences on any 
plate to have identifiable and functional B cell 
receptor sequences. When PCR controls were 
available, we required >50% of the positive 
controls to have identifiable control sequences 
(two to four) and the negative controls to have 
<50% of the wells with identifiable sequences 
(one or zero) (VRC) or we required the nega- 
tive controls to have <50% of the wells with 
identifiable sequences (one or zero) (FHCC). 
Sequence quality was first checked with the 
Genewiz quality report to determine overall 
sequencing success for the plate. Sequences 
were then analyzed by IMGT/V-QUEST for 
productive B cell receptor sequences or se- 
quences with identifiable B cell receptor gene 
usage. When synthetic PCR controls were 
used, all sequences were screened for con- 
tamination by the spiking of the controls using 
the CDR3 amino acid junction sequence tags 
mentioned previously. All data was trans- 
ferred to the Vaccine Immunology Statistical 
Center (VISC) for data repository and analy- 
sis. However, sequencing plates that did not 
pass our criteria were investigated further 
and either sequenced again directly by Genewiz 
or PCR amplified from an earlier step with re- 
maining cDNA or PCR material. The new plates 
were submitted to Genewiz for sequencing, 
and new sequence data was identified with an 
increased “Replicate” number. 
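
The plate-level acceptance criteria above can be summarized in a short sketch. This is an illustration under stated assumptions, not the trial's actual code: the function name and arguments are hypothetical, and the control thresholds are written as counts out of four wells, following the parenthetical counts in the text (two to four positives identified; one or zero negatives identified).

```python
def plate_passes_qc(n_sample_wells, n_functional, controls=None, site="VRC"):
    """Decide whether a sequenced plate passes quality control.

    controls is None when no PCR controls were available; otherwise it is a
    (positives_identified, negatives_identified) pair, each out of four wells.
    """
    if controls is None:
        # Without controls: at least 50% of sequences must be identifiable
        # and functional B cell receptor sequences.
        return n_functional >= 0.5 * n_sample_wells
    positives_identified, negatives_identified = controls
    negatives_ok = negatives_identified <= 1
    if site == "FHCC":
        # FHCC applied only the negative-control criterion.
        return negatives_ok
    # VRC required both the positive- and negative-control criteria.
    return positives_identified >= 2 and negatives_ok
```

For example, `plate_passes_qc(88, 30, controls=(3, 0))` returns `True` at VRC because both control criteria are met, whereas a plate with two contaminated negative-control wells fails at either site and would be queued for repeat sequencing.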


Additional PCR and NGS BCR sequencing for the 
low-dose group 


To control for potential sequencing errors in 
the unidirectional/single-read Sanger sequenc- 
ing data, we resequenced selected samples 
from the low dose group using next genera- 
tion sequencing (NGS). To perform NGS on 
selected cells, cDNA generated for Sanger 
sequencing was used as the starting point. 
Plates were removed from -80°C storage and 
centrifuged at 800 x g for 1 min before being 
placed on wet ice. Selected wells from different 
plates were collected into a new plate, and 
PCR was performed using the same nested 
method used for Sanger sequencing. The 
nested PCR products from the individual heavy 
and light chains were pooled in equal volumes 
of 10 ul from each plate into a single plate. The 
pooled plate was purified using AmpureXP 
beads (Beckman Coulter Cat: A63881) at a 
ratio of 0.8x beads to PCR product, for a total 
volume of 24 ul of beads to 30 ul of pooled PCR 
product. The purified product was eluted into 
20 ul of RNase-free water. Each purified plate 
was quantified using the QuantIT dsDNA 
high sensitivity kit (Thermofisher Cat: Q33120) 
and normalized to 0.2 ng/ul for optimal NGS 
library generation using the Illumina Nextera 
XT kit (Illumina Cat: FC1-31-1096). The normalized 
plates were processed for NGS library 
preparation using 1.25 ul of the product with 
2.5 ul of Tagmentation buffer and 1.25 ul of the 
Amplicon Tagmentation mix provided in the 
Nextera kit. The plates were centrifuged at 
800 x g for 1 min and placed in the thermo- 
cycler at 55°C for 10 min. Following the in- 
cubation, 1.25 ul of neutralization buffer was 
added to the wells and incubated at room 
temperature for 5 min. The tagmented product 
was indexed using custom primers generated 
to multiplex 10 plates together in a single se- 
quencing run. Each plate was indexed with 
1.25 ul of a unique 10-base pair Mi7 index at 
a concentration of 10 uM and each well of the 
plate was indexed with 1.25 ul of a unique 
10 base pair Mi5 index at a concentration of 
10 uM, along with 3.75 ul of the NPM reaction 
mix provided in the Nextera XT kit. The PCR 
plates were centrifuged at 800 x g for 1 min 
and placed in the thermocycler for indexing. 
The thermocycler parameters used were 72°C 
for 3 min, 12 cycles of 95°C for 30 s, 50°C for 
30 s, and 72°C for 1 min, followed by a final 
extension of 72°C for 5 min. The indexed 
libraries were further PCR purified using the 
AmpureXP beads at 0.8x concentration and 
eluted into 10 ul of RNase-free water. The pu- 
rified product from 10 indexed plates was 
pooled together for loading onto the Illumina 
Miseq for sequencing. The final concentration 
of the pooled library was determined using 
the Kapa QPCR kit (Roche Cat: KK4835) and 
diluted to 4 nM for sample denaturation according 
to the Illumina loading guidelines. 
The denatured sample was further diluted to 
10 pM concentration for loading onto the Miseq 
V3 600 cycle kit (Illumina Cat: MS-102-3003). 
Demultiplexed fastq files were retrieved from 
the Miseq and aligned to immunoglobulin 
genes using BALDR (78). 

High quality NGS sequences with at least 
100 Nextera reads were obtained for a total 
of 1998 samples and were regarded as per- 
fectly accurate. Comparison of high quality 
NGS sequences to corresponding Sanger se- 
quences revealed perfect agreement for 82.9% 
of Sanger reads at the nucleotide level and 
85.9% of Sanger reads at the amino acid level. 
There were differences of 1, 2, 3, 4, or 5 nu- 
cleotides for 7.8, 2.3, 1.3, 0.65, or 0.75% of 
Sanger reads, respectively, and differences 
of 1, 2, 3, 4, or 5 amino acids for 6.7, 1.8, 1.3, 
0.70, or 0.50% of Sanger reads, respectively. 
Hence, the NGS data confirmed that >94.2% 
of the Sanger sequencing reads were accurate 
to within three nucleotides, and >95.6% of 
the Sanger sequencing reads were accurate 
to within three amino acids. We concluded 
that it was not necessary to carry out NGS 
resequencing on the remaining low dose sam- 
ples or on any of the high dose samples. Never- 
theless, a small fraction (<4.4%) of Sanger 
reads differed by more than three amino acids 
from the corresponding high quality NGS se- 
quences. High quality NGS sequences were 
used in place of the corresponding Sanger 
sequences for downstream analysis in cases 
where Sanger and NGS differed by 1 to 19 nu- 
cleotides, as described in the NGS module of 
the BCR bioinformatic analysis pipeline below. 
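
The cumulative agreement figures quoted above follow from summing the reported percentages. A minimal arithmetic sketch is given here; because the inputs are the rounded percentages from the text, the sums are approximate, and the variable names are illustrative.

```python
# Percent of Sanger reads in perfect agreement with high quality NGS.
NT_EXACT = 82.9
AA_EXACT = 85.9

# Percent of Sanger reads differing by exactly 1..5 positions.
NT_DIFF = {1: 7.8, 2: 2.3, 3: 1.3, 4: 0.65, 5: 0.75}
AA_DIFF = {1: 6.7, 2: 1.8, 3: 1.3, 4: 0.70, 5: 0.50}


def percent_within(exact, diffs, max_differences):
    """Percent of reads with at most max_differences mismatches."""
    return exact + sum(p for d, p in diffs.items() if d <= max_differences)


nt_within_three = percent_within(NT_EXACT, NT_DIFF, 3)   # about 94.3
aa_within_three = percent_within(AA_EXACT, AA_DIFF, 3)   # about 95.7
aa_more_than_three = 100 - aa_within_three               # about 4.3
```

These sums are consistent with the stated bounds of >94.2% (nucleotides), >95.6% (amino acids), and <4.4% of reads differing by more than three amino acids.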


Sample identity confirmation for one case of 
accidental sample mislabeling 


Comparison of serum antibody binding data 
from BAMA with antigen-specific B cell fre- 
quency data from cytometry suggested that 
samples from visits 8 and 10 (weeks 10 and 16) 
for placebo-recipient PubID_164 might have 
been accidentally swapped with samples from 
the same time points for vaccine-recipient 
PubID_153 during flow cytometry analysis. 
Short tandem repeat (STR) analysis along with 
confirmatory flow cytometry testing was used 
to confirm the identity of the samples. Dup- 
licate samples were thawed and divided. Small 
aliquots were lysed and transported to two 
independent facilities for STR testing to con- 
firm donor sample identity. Remaining sam- 
ples were processed by trial flow cytometry 
protocols for comparison of cellular pheno- 
typic data. Both STR testing facilities were in 
agreement with each other and the confirma- 
tory flow cytometry data to confirm sample 
identities. This analysis confirmed that a sam- 
ple labeling error of transposing the partici- 
pant IDs had occurred during analysis by flow 
cytometry, effectively swapping the samples for 
the two participants at the time points listed 
above. The sample identity confirmation pro- 
vided justification for the bioinformatic pipe- 
line to correct for the effective “sample swap,” 
as described in the SWAP module below. 


Bioinformatic BCR sequence processing 

Database 


ABI files containing DNA sequence informa- 
tion from Sanger sequencing were organized 
in a central database at VISC in which each 
file was regarded as an entry in the database. 
The files were contained within folders such 
that each folder represented a 96-well plate. 
The folder and file naming structure was: 

<Date Deposited>/<Plate Tracking Number>/ 
<Participant ID>_<Visit Number>_<Plate 
Number>_<Round>_<Chain>_<Replicate>_ 
<Well>.abi 

Date Deposited was the date the ABI file 
was deposited into the database. Plate Track- 
ing Number was the FedEx tracking number 
associated with shipping the plate to Genewiz. 
Participant ID was the unique identifier as- 
signed to each participant in the trial. Visit 
Number was the visit number as described 
in fig. S1 and table S2. Plate Number was the 
unique number assigned to each 96-well plate 
sorted for each participant at each visit. Round 
could be either 1 or 2, indicating the first 
or second of the nested PCR reactions; all sam- 
ples submitted for sequencing were round 2. 
Chain was heavy, kappa, or lambda. Replicate 
was the index identifying which of several 
potential nested PCR and sequencing attempts 
for the plate the file came from. Well was 
the unique number for each well in a 96-well 
plate. The database contained a total of 158,954 
ABI files corresponding to 1724 96-well plates, 
some of which were only partially filled. The 
database also contained a sample manifest 
that linked metadata to each well for which 
sequencing data was obtained. The data fields 
in the manifest included several important 
boolean variables, such as (i) Has_sorted_cell, 
which was true if the well contained a sorted 
cell; (ii) Is_negative_control, which was true if 
the well contained a negative control sample; 
(iii) Is_positive_control, which was true if the 
well contained a positive control sample; and 
(iv) Is_doublet, which was true if the well was 
known from FACS to contain a doublet (two 
sorted cells); and many other metadata about 
the sample (the sample manifest is available 
in the data repository; see Data and materials 
availability below). Sequences obtained by 
NGS for a small subset of samples from the 
low dose group were also deposited into the 
central database in a spreadsheet linking each 
NGS sequence to a unique sequence identifier 
that could be matched to the Sanger sequence 
identifiers. NGS sequences were incorporated 
into the data analysis as described below. 


BCR analysis pipeline 


Preliminary analyses of BCR sequences were 
carried out during the trial, as the sequencing 
data was generated and deposited into the 
central database. This was necessary to obtain 
early readouts on the efficacy of the vaccine in 
order to help guide decision-making about 
additional preclinical and clinical experiments 
to build on the findings from this trial. The 
preliminary analyses also enabled us to select 
BCR sequences for production as soluble IgGs 
and subsequent characterization of binding 
affinities by SPR. Initially, and during most 
of the trial, the preliminary analysis pipeline 
employed Abstar [https://github.com/briney/ 
abstar; (79)] for BCR gene assignment and an- 
notation, coupled with custom scripts written 
in various programming languages for analy- 
sis. However, toward the end of the trial, and 
for the final analysis presented in this man- 
uscript, we developed a more streamlined, 
adaptable, and transparent code base for 
analysis. The final, improved analysis pipeline 
used Sequencing Analysis and Data library 
for Immunoinformatics Exploration (SADIE, 
https://github.com/jwillis0720/sadie) for BCR 
gene assignment and annotation, coupled with 
a series of python modules, and reported BCR 
sequence characteristics in the standardized 
Adaptive Immune Receptor Repertoire (AIRR) 
format (80). 

The final BCR analysis pipeline (fig. S10), a 
single installable Python package, consisted 
of 12 independent modules that read as input 
the dataframe (a data structure that organizes 
tabular data) from the previous step and pro- 
duced as output a new dataframe. The modules 
(FIND, MODEL, SPLIT, CORRECT, ANNOTATE, 
JOIN, NGS, TAG, SWAP, PAIR, PERSONALIZE, 
and MUTATION) are described below. 

FIND: The FIND module collected infor- 
mation from all ABI files in the central data 
repository. The filename, file path, nucleotide 
sequence, and Phred score (81, 82) were stored 
into a dataframe for subsequent analysis. A 
total of 158,954 ABI files were found. 

MODEL: The MODEL module tokenized the 
filename and path into expected metadata 
fields (e.g., verified that participant id was in 
our list of expected participant ids) and en- 
forced consistent formatting for each field 
(e.g., rep0 or r0 became REP0 for the replicate 
field). 
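
The tokenization step can be sketched as follows. This is a minimal illustration that assumes the folder and file naming structure given in the Database section; the concrete field formats (for example `V08`, `P01`, `A01`), the example path, and all function names are hypothetical.

```python
import re
from pathlib import PurePosixPath

FILENAME_FIELDS = (
    "participant_id", "visit", "plate", "round", "chain", "replicate", "well",
)
VALID_CHAINS = {"heavy", "kappa", "lambda"}


def tokenize(path, expected_participants):
    """Split an ABI file path into metadata fields, or return None if invalid."""
    p = PurePosixPath(path)
    if p.suffix.lower() != ".abi" or len(p.parts) < 3:
        return None
    # Split from the right so that participant IDs may contain underscores.
    tokens = p.stem.rsplit("_", 6)
    if len(tokens) != len(FILENAME_FIELDS):
        return None
    record = dict(zip(FILENAME_FIELDS, tokens))
    record["date_deposited"] = p.parts[-3]
    record["tracking_number"] = p.parts[-2]
    if record["participant_id"] not in expected_participants:
        return None
    record["chain"] = record["chain"].lower()
    if record["chain"] not in VALID_CHAINS or record["round"] not in {"1", "2"}:
        return None
    # Enforce one spelling of the replicate field: rep0 or r0 becomes REP0.
    match = re.fullmatch(r"(?:rep|r)(\d+)", record["replicate"], flags=re.IGNORECASE)
    if match is None:
        return None
    record["replicate"] = f"REP{match.group(1)}"
    return record


example = tokenize(
    "2019-05-01/123456789012/PubID_153_V08_P01_2_heavy_r0_A01.abi",
    expected_participants={"PubID_153"},
)
```

Returning `None` for paths that cannot be tokenized marks the entry as having incomplete metadata, so a later step can drop it.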

SPLIT: The SPLIT module dropped entries 
that were duplicates or that had incomplete 
metadata fields assessed in the MODEL mod- 
ule. An entry was determined to be a duplicate 
if it contained the same metadata fields as well 
as the same nucleotide sequence and Phred 
score. Such duplicate entries resulted from 
duplicate manual uploads to the central data 
repository. After the SPLIT module, a total 
of 155,801 entries were available for subse- 
quent analysis. 
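
A minimal sketch of this filtering step is shown here, using plain Python dictionaries in place of the dataframe. The entry layout (`metadata`, `sequence`, `phred`) is an assumption made for illustration. The counts in the text imply that 158,954 - 155,801 = 3153 entries were dropped at this stage.

```python
def split(entries):
    """Drop entries with incomplete metadata and exact duplicate entries.

    Each entry is assumed to be a dict with 'metadata' (dict, or None when
    tokenization failed), 'sequence' (str), and 'phred' (list of int).
    """
    seen = set()
    kept = []
    for entry in entries:
        if entry.get("metadata") is None:
            continue  # incomplete metadata
        # Duplicate = same metadata fields, same sequence, same Phred scores.
        key = (
            tuple(sorted(entry["metadata"].items())),
            entry["sequence"],
            tuple(entry["phred"]),
        )
        if key in seen:
            continue
        seen.add(key)
        kept.append(entry)
    return kept
```

Entries that share metadata but differ in sequence or quality scores are kept, since only exact re-uploads are treated as duplicates.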

CORRECT: The CORRECT module applied 
a curated list of updates to metadata fields 
and drops of certain sequences or plates that 
were requested by the FHCC and VRC ex- 
perimental labs. Updates were generally correc- 
tions to manual file-naming errors that violated 
the predefined naming format (common errors 
including entering “NA” instead of a valid field 
entry, or specifying the visit number with an 
invalid format). Drops were mostly either to 
eliminate wells for which no cell was known 
to be sorted, or to eliminate plates that had 
been named incorrectly and were resubmitted 
separately with correct naming. A total of 
1318 entries were updated, and 701 entries 
were dropped. A full list of updates and drops 
was stored in JSON format and can be reviewed 
in the GitHub repository. The CORRECT 
module then grouped all entries by participant 
ID, time point, plate, well, replicate, and chain, 
to ensure uniqueness. After the CORRECT 
module, a total of 155,100 entries were avail- 
able for subsequent analysis. 
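
The curation logic can be sketched as follows. The JSON layout, field values, and function names below are hypothetical, since the text states only that updates and drops were stored in JSON format; applying updates before drops is also an assumption. The reported counts are consistent with the drops: 155,801 - 701 = 155,100.

```python
import json

UNIQUE_FIELDS = ("participant_id", "visit", "plate", "well", "replicate", "chain")

# Hypothetical curation file content; the real layout is not given in the text.
CURATION_JSON = """
{
  "updates": [
    {"match": {"plate": "P01", "visit": "NA"}, "set": {"visit": "V08"}}
  ],
  "drops": [
    {"participant_id": "PubID_000", "plate": "P03"}
  ]
}
"""


def matches(entry, condition):
    """True if every field in the condition equals the entry's value."""
    return all(entry.get(key) == value for key, value in condition.items())


def correct(entries, curation):
    """Apply curated updates, then curated drops, then check uniqueness."""
    kept = []
    for entry in entries:
        entry = dict(entry)  # copy so the input is not modified
        for update in curation["updates"]:
            if matches(entry, update["match"]):
                entry.update(update["set"])
        if any(matches(entry, drop) for drop in curation["drops"]):
            continue
        kept.append(entry)
    keys = [tuple(e[f] for f in UNIQUE_FIELDS) for e in kept]
    if len(keys) != len(set(keys)):
        raise ValueError("entries are not unique after correction")
    return kept


curation = json.loads(CURATION_JSON)
```

Keeping the corrections in a data file rather than in code makes each manual fix reviewable on its own.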

ANNOTATE: The ANNOTATE module used 
the Sequencing Analysis and Data library 
for Immunoinformatics Exploration (SADIE, 
https://github.com/jwillis0720/sadie) to anno- 
tate BCR sequences. The SADIE AIRR module 
ports IgBLAST (83) for analysis of nucleotide 
sequences and ensures that the data are rep- 
resented in AIRR recombination schema that 
defines a data model, field names, data types, 
and encodings (80). In addition, SADIE AIRR 
provides a versioned and tested python pack- 
age that includes the IgBLAST executable and 
allows automatic creation of custom IgBLAST 
databases, a process that is otherwise difficult. 
SADIE AIRR provides the user with scriptable, 
granular control over IgBLAST options via a 
python API or a command-line interface. SADIE 
AIRR will try to determine optimized V(D)J 
alignment penalties to find a productive recom- 
bination (adaptable penalty model) and will 
also correct insertions and deletions that are 
absent in the germline alignment using current 
IgBLAST (v 1.17.1). For analysis of the G001 light 
chain sequences, SADIE AIRR implementation 
of the adaptable penalty model was helpful for 
determining likely germline rearrangements 
for light chains with short LCDR3s with sig- 
nificant VJ gene overlap. In these cases, the use 
of adaptable penalties facilitated identification 
of likely boundaries for the V and J genes, 
which resulted in identification of a slightly 
increased number of productive recombina- 
tions. The adaptable penalty model started 
with the default value of -1 for both V and J 
gene penalties, and in that configuration iden- 
tified 27,913 productive antibody light chain 
recombinations. Other (V, J) penalty combinations, 
of (-2, -1), (-1, -3), and (-3, -1), identified 
683, 65, and 37 additional light chain 
recombinations, respectively, increasing the 
total number of light chain recombinations by 
2.8%. The ANNOTATE module joined all AIRR 
recombination data fields with the previous 
metadata fields for subsequent analysis. In 
addition, ANNOTATE calculated a “sliced” 
Phred score that was the mean Phred score 
computed over just the V(D)J portion of the 
sequence. This “sliced” Phred score was used 
in evaluation of sequence quality in subsequent 
modules. After the ANNOTATE module, a total 
of 155,100 entries were available for subsequent 
analysis. 
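
The "sliced" Phred score can be illustrated with a short sketch. The coordinate convention (0-based, half-open) and the function name are assumptions; in practice the V(D)J boundaries would come from the alignment annotation.

```python
def sliced_phred(phred_scores, vdj_start, vdj_end):
    """Mean Phred score over the V(D)J portion of a read.

    vdj_start and vdj_end are 0-based, half-open coordinates of the V(D)J
    region within the read (an assumed convention for this sketch).
    """
    window = phred_scores[vdj_start:vdj_end]
    if not window:
        raise ValueError("empty V(D)J window")
    return sum(window) / len(window)
```

For example, `sliced_phred([10, 10, 40, 40, 40, 10], 2, 5)` returns `40.0`, whereas the mean over the whole read is 25. Restricting the mean to the V(D)J region avoids penalizing reads for low-quality bases at the ends, outside the region of interest.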

JOIN: As noted above, a manifest contain- 
ing additional information about each well in 
each 96-well plate subjected to DNA sequenc- 
ing was uploaded to the central data reposi- 
tory. The data fields in the manifest included 
the tokenized metadata from the FIND mod- 
ule as well as additional information. The JOIN 
module merged the manifest data with the an- 
notated sequencing data from the ANNOTATE 
module. There were 22,707 entries with se- 
quence information that could not be found 
in the manifest, of which only 633 were pro- 
ductive antibody sequences, and JOIN was 
unable to process those entries. After the JOIN 
module, 132,393 entries were available for 
subsequent analysis. 
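
The merge can be sketched as an inner join on the fields that identify a well. The choice of join key and the function name are assumptions for illustration. The reported counts are consistent with an inner join: 155,100 - 22,707 = 132,393.

```python
WELL_KEY = ("participant_id", "visit", "plate", "well")


def join(annotated_entries, manifest_rows):
    """Attach manifest metadata to annotated entries; report unmatched ones."""
    lookup = {tuple(row[k] for k in WELL_KEY): row for row in manifest_rows}
    joined, unmatched = [], []
    for entry in annotated_entries:
        row = lookup.get(tuple(entry[k] for k in WELL_KEY))
        if row is None:
            unmatched.append(entry)  # no manifest record for this well
        else:
            joined.append({**row, **entry})
    return joined, unmatched
```

Returning the unmatched entries separately makes it possible to count how many of them carried productive antibody sequences.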

NGS: The NGS module took the output 
from the resequencing pipeline, reannotated 
the sequences using SADIE AIRR, and up- 
dated the sequences and annotation fields 
(substituted the NGS sequences and their 
associated annotations in place of the corre- 
sponding Sanger sequences and annotations) 
if there were (i) fewer than 20 nucleotide dif- 
ferences as measured by Levenshtein distance 
(84) between the Sanger and NGS; and (ii) 
more than 100 V(D)J reads were produced 
by the NGS pipeline. When resequencing was 
carried out, the original Sanger sequence, the 
corrected sequence, and the Levenshtein dis- 
tance were added to the dataframe for the 
subsequent analysis. Of 1998 high quality NGS 
sequences, 1656 (82.8%) sequences perfectly 
matched the corresponding Sanger sequence; 
37 (1.9%) differed from the corresponding 
Sanger sequence by more than 20 nucleotides 
and were not considered for correcting the 
Sanger sequence; and 304 (15.2%) differed 
from the corresponding Sanger sequence by 
1 to 19 nucleotides and were used to correct 
the Sanger sequence. 
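
The substitution rule can be sketched with a standard dynamic-programming Levenshtein distance. This is an illustration only: the function names are hypothetical, and the thresholds follow the two criteria stated for this module (fewer than 20 differences and more than 100 reads).

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def choose_sequence(sanger, ngs, ngs_reads, max_distance=20, min_reads=100):
    """Return (sequence to use, Levenshtein distance between Sanger and NGS)."""
    distance = levenshtein(sanger, ngs)
    use_ngs = distance < max_distance and ngs_reads > min_reads
    return (ngs if use_ngs else sanger), distance
```

When the distance is zero the substitution changes nothing, so only sequences differing by 1 to 19 nucleotides are effectively corrected.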

TAG: The tag module searched the produc- 
tive sequences and performed a sequence 
alignment against the positive control se- 
quences used in the PCR reaction, Synth or 
Immo (described in PCR and sequencing pos- 
itive and negative controls). A positive match 
to a Synth or Immo heavy chain required that 
the HCDR3 nucleotide normalized identity 
to the true sequence was greater than 0.95 and 
the normalized VH gene distance to the true 
sequence was greater than 0.9. For kappa and 
lambda matches to Synth, the LCDR3 and VL 
gene distances to the true sequence were both 
required to be greater than 0.9. For kappa and 
lambda chain matches to Immo, the LCDR3 
nucleotide normalized distance to the true 
sequence was required to be greater than 0.92; 
and the normalized distance to the true Vx 
gene sequence was required to be greater than 
0.9. In addition, for the kappa chain matches 
to Immo, an exact match was required to the 
n-addition nucleotide sequence (CG). These 
parameters were calculated by comparing 
control sequences from NGS with control se- 
quences from Sanger and adjusting the nor- 
malized distances until the controls were all 
identified by their Sanger sequences. 

SWAP: The SWAP module transposed all 
data in the current dataframe from participant 
PubID_153 at visits 8 and 10 with participant 
PubID_164. See Sample identity confirmation 
for one case of accidental sample mislabeling 
for additional details. 

PAIR: The pair module applied a series of 
filters to produce a final set of high quality 
BCR sequences including both heavy and light 
chains. The first filter eliminated nonfinal 
sequences, referred to as nonterminal replicates: 
for plates in which Sanger sequencing 
was attempted two or three times until the 
sequencing results passed quality controls 
(“Single-cell sequencing”), only the sequences 
from the final, or terminal, run were used. This 
filter eliminated 8966 nonterminal replicate 
entries. The second filter eliminated positive 
control sequences that were either derived 
from a well labeled as a control in the sample 
manifest (n = 6898) or were identified from 
the TAG module (n = 2746). The third filter 
removed sequences that either had no anti- 
body recombination detected by SADIE AIRR 
(n = 2536) or corresponded to negative control 
wells in the manifest (n = 5607). The fourth 
filter removed 1677 heavy or light chain se- 
quences identified in the manifest as doublets 
(cases of two cells sorted into a single well). 
The fifth filter removed 63,036 unproductive 
or incomplete V(D)J sequences that either con- 
tained a stop codon (unproductive; n = 61,374) 
or contained a recombination that did not 
start at the first nucleotide of framework 1 and 
end at the last nucleotide of framework 4 
(incomplete V(D)J; n = 1662). Of the incom- 
plete V(D)J sequences, 505 were heavy chains 
less than 90 amino acids in length, and 836 
were light chains less than 80 amino acids in 
length. The sixth filter removed 1965 entries 
with mean Phred scores over the V(D)J se- 
quence (“sliced” Phred scores) less than 30. 
The remaining 41,887 sequences, which in- 
cluded both heavy and light chains, were con- 
sidered candidates for pairing. Sequences 
were then grouped by participant id, time 
point, plate and well. In the seventh and final 
filter, groups containing more than one heavy 
or light chain were eliminated (n = 2033), and 
unpaired heavy or light chains (n = 12,102) 
were eliminated. The remaining groups were 
considered proper pairs, because they con- 
tained exactly one heavy chain and one light 
chain. This process yielded 13,876 total BCR 
sequences with paired heavy and light chains 
(9862 heavy/kappa; 4014 heavy/lambda). Of 
those, 11,372 were CD4bs-specific (sorted as 
KO− GT8++) and 2504 were sorted as either GC B 
cell phenotype (from the first two plates in each 
FNA sort) or plasmablast phenotype (from the 
first two plates in each PB sort) without regard 
to eOD-GT8 tetramer binding (see FACS and 
fig. S10). Among the 2504 BCRs sorted by phenotype 
only, 71 (2.8%) were VRC01-class, and 45 of 
those were produced by a single donor at a single 
time point (PubID_151 at week 3). We focused 
our subsequent analyses on the 11,372 BCRs 
sorted as CD4bs-specific. 
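
The final pairing filter can be sketched as a grouping step. The field names and function name are hypothetical. The reported counts are consistent with this logic: 41,887 - 2033 - 12,102 = 27,752 sequences, which is exactly 2 x 13,876 paired chains.

```python
from collections import defaultdict


def pair(sequences):
    """Keep only wells with exactly one heavy chain and exactly one light chain."""
    groups = defaultdict(list)
    for seq in sequences:
        key = (seq["participant_id"], seq["visit"], seq["plate"], seq["well"])
        groups[key].append(seq)
    pairs = []
    for members in groups.values():
        heavy = [s for s in members if s["chain"] == "heavy"]
        light = [s for s in members if s["chain"] in ("kappa", "lambda")]
        # Wells with extra chains, or with a missing partner, are discarded.
        if len(heavy) == 1 and len(light) == 1:
            pairs.append((heavy[0], light[0]))
    return pairs
```

A well containing both a kappa and a lambda sequence is treated as having more than one light chain and is therefore excluded.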

PERSONALIZE: The personalize module 
reannotated all BCR sequences with the personalized 
IGHV1-2 allele haplotype discovered 
in the genotype analysis (IGHV1-2 genotype 
analysis using IgDiscover). SADIE AIRR was 
run against a personalized reference set curated 
for each participant using the SADIE reference 
module. This allowed annotation only against 
the correct haplotype of IGHV1-2 alleles and 
thus a more accurate VH somatic mutation 
percentage assignment. 

MUTATION: The mutation module used 
SADIE ANARCI to number each V(D)J amino 
acid sequence using the Kabat numbering 
scheme. SADIE ANARCI is a port of the ANARCI 
program (85) that numbers amino acid sequences 
in a variety of numbering schemes. 
Both the germline V(D)J and mature V(D)J se- 
quences were numbered in the Kabat scheme. 
The mutations were recorded and added to the 
final output dataframe. 


Computing VRC01-class frequencies by 
combining frequencies from FACS 
and BCR sequencing 

Memory B cell data 


The frequency of VRC01-class IgG memory 
B cells was estimated by multiplying two 
frequencies: (i) the frequency of CD4bs-specific 
(KO− GT8++) IgG memory B cells among all 
IgG memory B cells sorted by FACS; and (ii) 
the frequency of VRC01-class BCRs among 
all CD4bs-specific IgG memory B cells with 
sequenced BCRs. If the frequency of CD4bs-specific 
IgG memory B cells was zero, then 
the estimate for the VRC01-class frequency 
was also set to zero. In the case where the 
frequency of CD4bs-specific IgG memory 
B cells was positive but no CD4bs-specific IgG 
memory B cells were successfully sequenced, 
the estimate for the VRC01-class frequency 
was set to zero for postvaccination samples 
or to the CD4bs-specific frequency for baseline 
samples. This approach is conservative for 
detecting vaccine-induced responses. 
Additionally, based on the data for IgG memory B cells from PBMC samples,
we estimated the frequencies of VRC01-class B cells among three different
populations of B cells in the periphery. First, as an estimate for the
frequency of VRC01-class B cells among CD4bs-specific IgG memory B cells,
we used the frequency of VRC01-class B cells among BCR-sequenced
CD4bs-specific IgG memory B cells. Second, to obtain an estimate for the
frequency of VRC01-class B cells among GT8-specific IgG memory B cells, we
multiplied the frequency of VRC01-class B cells among CD4bs-specific IgG
memory B cells by the ratio of CD4bs-specific IgG memory B cells to
GT8-specific IgG memory B cells. Third, to obtain an estimate for the
frequency of VRC01-class IgG memory B cells among all B cells, we
multiplied the frequency of VRC01-class B cells among CD4bs-specific IgG
memory B cells by the ratio of CD4bs-specific IgG memory B cells to all
B cells.
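The estimation rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's analysis code: the function and argument names (`vrc01_class_frequency`, `rescale`, `cd4bs_freq`, `n_sequenced`, `n_vrc01`) are hypothetical and chosen here for readability.

```python
def vrc01_class_frequency(cd4bs_freq: float, n_sequenced: int,
                          n_vrc01: int, baseline: bool = False) -> float:
    """Estimate the VRC01-class frequency among all IgG memory B cells.

    cd4bs_freq  -- FACS frequency of CD4bs-specific cells among IgG memory B cells
    n_sequenced -- number of CD4bs-specific cells with sequenced BCRs
    n_vrc01     -- number of those sequenced BCRs called VRC01-class
    baseline    -- True for prevaccination samples
    """
    if cd4bs_freq == 0:
        return 0.0
    if n_sequenced == 0:
        # No BCRs recovered: zero after vaccination, but the full
        # CD4bs-specific frequency at baseline (the conservative choice).
        return cd4bs_freq if baseline else 0.0
    return cd4bs_freq * (n_vrc01 / n_sequenced)


def rescale(frac_among_cd4bs: float, n_cd4bs: int, n_population: int) -> float:
    """Re-express a fraction among CD4bs-specific cells relative to a larger
    population (e.g. GT8-specific cells or all B cells)."""
    return frac_among_cd4bs * n_cd4bs / n_population
```

For example, with a CD4bs-specific frequency of 0.2% and 10 VRC01-class BCRs among 40 sequenced, `vrc01_class_frequency(0.002, 40, 10)` gives 0.05%.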


GC B cell data 


The frequency of VRC01-class IgG GC B cells was estimated similarly as for
memory B cells, by multiplying two frequencies: (i) the frequency of
CD4bs-specific IgG GC B cells among all IgG GC B cells sorted by FACS; and
(ii) the frequency of VRC01-class BCRs among all CD4bs-specific IgG GC
B cells with sequenced BCRs. If the frequency of CD4bs-specific IgG GC
B cells was zero, then the estimate for the VRC01-class frequency was also
set to zero. If the frequency of CD4bs-specific IgG GC B cells was positive
but no CD4bs-specific IgG GC B cells were successfully sequenced, the
estimate for the VRC01-class frequency was set to zero.

Analogous to procedures we followed with IgG memory B cells from PBMCs,
based on the data for IgG GC B cells from FNA samples, we estimated the
frequencies of VRC01-class B cells among three additional populations of
B cells from FNA samples. First, as an estimate for the frequency of
VRC01-class B cells among CD4bs-specific IgG GC B cells, we used the
frequency of VRC01-class B cells among BCR-sequenced CD4bs-specific IgG GC
B cells. Second, to obtain an estimate for the frequency of VRC01-class
B cells among GT8-specific IgG GC B cells, we multiplied the frequency of
VRC01-class B cells among CD4bs-specific IgG GC B cells by the ratio of
CD4bs-specific IgG GC B cells to GT8-specific IgG GC B cells. Third, to
obtain an estimate for the frequency of VRC01-class IgG GC B cells among
all B cells, we multiplied the frequency of VRC01-class B cells among
CD4bs-specific IgG GC B cells by the ratio of CD4bs-specific IgG GC B cells
to all B cells.


Plasmablast data 


The frequency of VRC01-class IgD⁻ plasmablasts (PBs) was also estimated
similarly as for memory B cells, by multiplying two frequencies: (i) the
frequency of CD4bs-specific IgD⁻ PBs among all IgD⁻ PBs sorted by FACS;
and (ii) the frequency of VRC01-class BCRs among all CD4bs-specific IgD⁻
PBs with sequenced BCRs. If the frequency of CD4bs-specific IgD⁻ PBs was
zero, then the estimate for the VRC01-class frequency was also set to zero.
If the frequency of CD4bs-specific IgD⁻ PBs was positive but no
CD4bs-specific IgD⁻ PBs were successfully sequenced, the estimate for the
VRC01-class frequency was set to zero.

Analogous to procedures we followed with memory B cells from PBMCs and GC
B cells from FNAs, based on the data for IgD⁻ PBs from week 9 PBMC samples,
we estimated the frequencies of VRC01-class B cells among three additional
populations of B cells in the periphery at week 9. First, as an estimate
for the frequency of VRC01-class PBs among CD4bs-specific IgD⁻ PBs, we used
the frequency of VRC01-class PBs among BCR-sequenced CD4bs-specific IgD⁻
PBs. Second, to obtain an estimate for the frequency of VRC01-class PBs
among GT8-specific IgD⁻ PBs, we multiplied the frequency of VRC01-class PBs
among CD4bs-specific IgD⁻ PBs by the ratio of CD4bs-specific IgD⁻ PBs to
GT8-specific IgD⁻ PBs. Third, to obtain an estimate for the frequency of
VRC01-class IgD⁻ PBs among all B cells, we multiplied the frequency of
VRC01-class cells among CD4bs-specific IgD⁻ PBs by the ratio of
CD4bs-specific IgD⁻ PBs to all B cells.


Key B cell frequencies reported 


B cell frequencies and B cell receptor (BCR) 
signatures were measured by flow cytometry 
(FACS) and BCR sequencing, respectively. In 
this section we list the key frequencies re- 
ported in the manuscript and/or in the sup- 
plementary data files. Additional frequencies 
not listed here are provided in the supplementary data files. Methods to
estimate frequencies based on combined cytometry and BCR sequence analyses
are given in the preceding section, "Computing VRC01-class frequencies by
combining frequencies from FACS and BCR sequencing."

The key frequencies reported for PBMC samples (weeks −4, 4, 8, 10, and 16)
were as follows.

From cytometry analysis: 

• Percent of IgG memory B cells that were GT8⁺⁺ (regardless of KO binding
status)

• Percent of GT8⁺⁺ IgG memory B cells that were KO⁻

• Percent of IgG memory B cells that were CD4bs-specific

From combined cytometry and BCR sequence analysis:

• Percent of CD4bs-specific IgG memory B cells that were VRC01-class

• Percent of GT8-specific IgG memory B cells detected as VRC01-class

• Percent of IgG memory B cells detected as VRC01-class

• Percent of B cells detected as VRC01-class

The key frequencies reported for FNA sam- 
ples (weeks 3 and 11) were as follows. 

From cytometry analysis: 

• Percent of IgG GC B cells that were GT8⁺⁺ (regardless of KO binding
status)

• Percent of GT8⁺⁺ IgG GC B cells that were KO⁻

• Percent of IgG GC B cells that were CD4bs-specific

From combined cytometry and BCR sequence analysis:

• Percent of CD4bs-specific IgG GC B cells that were VRC01-class

• Percent of GT8-specific IgG GC B cells detected as VRC01-class

• Percent of IgG GC B cells detected as VRC01-class

• Percent of B cells detected as VRC01-class

The key frequencies reported for PB sam- 
ples (week 9) were as follows. 

From cytometry analysis: 

• Percent of IgD⁻ PBs that were GT8⁺⁺ (regardless of KO binding status)

• Percent of GT8⁺⁺ IgD⁻ PBs that were KO⁻

• Percent of IgD⁻ PBs that were CD4bs-specific

From combined cytometry and BCR sequence analysis:

• Percent of CD4bs-specific IgD⁻ PBs that were VRC01-class

• Percent of GT8-specific IgD⁻ PBs detected as VRC01-class

• Percent of IgD⁻ PBs detected as VRC01-class

• Percent of B cells detected as VRC01-class


VRC01-class response calls


The criteria for determining if detection of one or more VRC01-class
B cells in a sample represented a vaccine-induced VRC01-class response
were as follows.


Memory B cell data 


For each post-baseline visit in which memory B cells from PBMCs were
sorted (weeks 4, 8, 10, and 16), detection of one or more VRC01-class
B cells was labeled a vaccine-induced VRC01-class response if the
estimated frequency of VRC01-class IgG memory B cells among all IgG memory
B cells was greater than the corresponding baseline frequency. For
participants with no sequences of CD4bs-specific memory B cells available
at baseline, the baseline frequency was estimated as the frequency of
CD4bs-specific (KO⁻GT8⁺⁺) IgG memory B cells, which was a conservative
approach.


GC B cell data 


For FNA samples (weeks 3 and 11), for which baseline data were not
available, detection of one or more VRC01-class B cells was labeled a
vaccine-induced VRC01-class response if the estimated frequency of
VRC01-class IgG GC B cells among all IgG GC B cells was greater than 0.1%
(1 in 1000). This was an arbitrary threshold, but we judged it to be
reasonable considering that (i) none of the placebos produced any
detectable VRC01-class B cells in FNA samples, and (ii) 0.1% represents a
substantial frequency for a single class of B cells (with specific BCR
properties) within a polyclonal response.


Plasmablast data 


For PB samples (week 9), for which baseline data were also not available,
detection of one or more VRC01-class plasmablasts was labeled a
vaccine-induced VRC01-class response if the estimated frequency of
VRC01-class IgD⁻ PBs among all IgD⁻ PB cells was greater than 0.1% (the
same numerical threshold used for testing FNA samples). We judged this to
be a reasonable threshold, because (i) none of the placebos produced any
detectable VRC01-class B cells in PB samples, and (ii) as noted above, 0.1%
represents a substantial frequency for a single class of B cells (with
specific BCR properties) within a polyclonal response.
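The three response-call rules share one structure: at least one VRC01-class cell must be detected, and the estimated frequency must exceed a threshold that is either the participant's baseline frequency (memory B cells) or the fixed 0.1% cutoff (GC B cells and plasmablasts). A minimal sketch follows; the function and parameter names are hypothetical and are not taken from the study's code.

```python
from typing import Optional

FIXED_THRESHOLD = 0.001  # 0.1% (1 in 1000), used when no baseline sample exists


def is_vaccine_induced_response(n_vrc01_detected: int,
                                estimated_freq: float,
                                baseline_freq: Optional[float] = None) -> bool:
    """Return True if a sample counts as a vaccine-induced VRC01-class response.

    baseline_freq is supplied for memory B cell (PBMC) samples and left as
    None for FNA (GC B cell) and plasmablast samples.
    """
    if n_vrc01_detected < 1:
        return False
    threshold = FIXED_THRESHOLD if baseline_freq is None else baseline_freq
    return estimated_freq > threshold
```

The comparison is strict, so a postvaccination frequency equal to the baseline frequency is not called a response.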


Pretrial evaluation of baseline VRC01-class IgG B cell frequency


During pretrial development of the B cell sort- 
ing and BCR sequencing assay, we sought to 
determine an estimate for the baseline frequency of VRC01-class IgG B cells
in humans. We applied the assay to leukapheresis samples from 35 healthy,
HIV-unexposed donors, including approximately 12 billion PBMCs and
approximately 29.2 million IgG⁺ memory B cells. From 142 BCRs with HC and
LC sequenced out of 171 CD4bs-specific cells sorted, we detected no
VRC01-class B cells, indicating a relatively low baseline frequency of
VRC01-class B cells in the human IgG memory B cell repertoire. In contrast,
we verified by sorting of human naive IgM⁺ B cells that the same assay
isolated VRC01-class human naive precursors at frequencies similar to those
reported previously (not shown).


IGHV1-2 genotype analysis using IgDiscover 


Genotyping of the G001 participants was per- 
formed under an ethics approval from the 
National Ethical Review Agency of Sweden 
(decision no. 2021-01850). Total RNA was iso- 
lated from frozen PBMC samples prepared 
from each participant (n = 48) using the Qiagen 
RNeasy kit and protocol. Two hundred nano- 
grams of total RNA was used for cDNA syn- 
thesis with an IgM constant region primer that 
contained a Unique Molecular Identifier (UMI), 
according to the procedures described previously (86). Two independent 5′
multiplex libraries (L and U) were produced for each case using either an
IGHV leader region-specific primer set (L) or a primer set that targeted
the 5′ UTR (U). The libraries were individually indexed and sequenced on
the Illumina MiSeq platform using the V3 2 × 300 cycle kits (Illumina).
The Read 1 and Read 2 files
for each library were processed using the 
IgDiscover analysis pipeline as described 
previously (86, 87). The IgDiscover program 
(version 0.12.4.dev266+ge9e3119) enabled the 
analysis of IGHV1-2 allele content in each case, 
using the two independent libraries for high 
confidence allele identification. 


BCR sequence hierarchical clustering 


Average linkage agglomerative clustering (Fig. 3 
and figs. S15, S17, S18, S22, and S23) was performed 
using the SADIE cluster module. The module 
uses scikit-learn (88) to perform clustering 
specifically on BCR sequences with the AIRR 
annotation schema. Sequences were split into 
VRC01-class and non-VRC01-class and then grouped by HCDR3 and LCDR3 length.
For each subgroup, the distance between two antibodies was the sum of
Levenshtein distances between the corresponding HCDRs and LCDRs (HCDR1
distance + HCDR2 distance + HCDR3 distance + LCDR1 distance + LCDR2
distance + LCDR3 distance). To facilitate clustering of both mutated and
unmutated BCRs with a single distance cutoff, the "somatic pad" option was
used to subtract all common somatic mutations from the total Levenshtein
distance for any pair of BCRs (89). With this option, the minimal distance
between two BCRs was zero.
The sparse upper-triangular distance matrix 
was then saved into memory. Clustering was 
then performed using scikit-learn with the 
precomputed distance matrix using average- 
linkage clustering and a distance cutoff of 3. 
The cluster labels were extracted and added 
to the output dataframe. The centroid of every cluster was taken to be the
member with the lowest mean distance to every other member in the cluster.
The all-vs-all, intracluster, and intercentroid 
distances were then computed from the saved 
distance matrix. 
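The distance definition, average-linkage clustering with a cutoff, and the centroid rule can be illustrated with a small pure-Python sketch. This is not the SADIE cluster module: the study used scikit-learn on a precomputed matrix, whereas the naive `average_linkage` loop below is a stand-in written for illustration. The "somatic pad" adjustment and the grouping by CDR3 length are omitted, the dictionary keys are hypothetical, and the rule of stopping once the closest pair of clusters is at or beyond the cutoff is an assumption.

```python
from itertools import combinations

CDR_KEYS = ("hcdr1", "hcdr2", "hcdr3", "lcdr1", "lcdr2", "lcdr3")


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def bcr_distance(x: dict, y: dict) -> int:
    """Sum of Levenshtein distances over the six CDRs."""
    return sum(levenshtein(x[k], y[k]) for k in CDR_KEYS)


def average_linkage(dist, cutoff):
    """Merge the closest pair of clusters until their average distance
    reaches the cutoff. dist is a square symmetric matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            d = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        if best[0] >= cutoff:
            break
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters


def centroid(cluster, dist):
    """Member with the lowest mean distance to the other members."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster if j != i)
               / max(len(cluster) - 1, 1))


# Toy example: three made-up BCRs with equal CDR lengths.
bcrs = [
    dict(hcdr1="GYTFT", hcdr2="INPNS", hcdr3="ARDYW", lcdr1="QSVSS", lcdr2="GAS", lcdr3="QQYEF"),
    dict(hcdr1="GYTFT", hcdr2="INPNS", hcdr3="ARDFW", lcdr1="QSVSS", lcdr2="GAS", lcdr3="QQYEF"),
    dict(hcdr1="GYTFT", hcdr2="INPNS", hcdr3="TKLPV", lcdr1="QSVSS", lcdr2="GAS", lcdr3="QQYEF"),
]
dist = [[bcr_distance(x, y) for y in bcrs] for x in bcrs]
clusters = average_linkage(dist, cutoff=3)
```

In this toy input the first two sequences differ by a single HCDR3 substitution and end up in one cluster, while the third stays on its own.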


Bioinformatic BCR sequence analysis other than VRC01-calls or clustering


BCR V gene assignments and mutation levels 
were determined using SADIE, which was also 
used in the ANNOTATION module described 
above. Mutation analysis for VH1-2 genes was 
made more accurate by accounting for the 
personalized IGHV1-2 genotypes as described 
in the PERSONALIZE module above. Statistical 
quantile analyses throughout the manuscript 
were carried out using R and Pandas (https://github.com/pandas-dev/pandas).
Confidence intervals for the non-VRC01-class response rate (fig. S19C) were
computed using the Wilson score method (90). Statistical comparisons for
BCR mutation levels at different time points 
(Fig. 5, fig. S20, and tables S46 to S49) were carried out using a Wilcoxon
signed-rank test for paired data using the VISCfunctions R package
(https://github.com/FredHutch/VISCfunctions). Analyses of VRC01-class BCR
features (Fig. 6 and figs. S16 and S24 to S29) were carried out by custom
Python functions available as a package in the data repository for G001
(https://github.com/SchiefLab/G001). We detected few recurring VRC01-class
clones at different time points, and therefore we did not carry out
analyses of intraclonal SHM over time as has been reported elsewhere (64).
Inferred germline amino acid
sequences were computed by reverting tem- 
plated V, D, and J gene segments to their 
germline sequences for the alleles predicted 
on the basis of the antibody nucleotide se- 
quence; in this process, VH1-2 allele predic- 
tions were made more accurate by restricting 
the allowed alleles for each participant to 
those experimentally determined by VH1-2 
genotyping for that participant. When referring to control distributions
from OAS (91, 92), we restricted the data to human, nonvaccinated,
no-disease-state entries. In the OAS data we report, each symbol refers to
a separate individual.


Antibody selection and production for SPR 


For IgG expression, we selected at least two VRC01-class and one
non-VRC01-class BCR from each participant at each postvaccination time
point, attempting to represent the genetic diversity and mutation levels of
the low dose BCRs. Although some mAbs failed to express, we carried out SPR
analyses for 267 VRC01-class and 145 non-VRC01-class BCRs across all
participants and postvaccination time points (Fig. 7, fig. S29, and
data S2). To assess affinities of potential naive precursors to the
postvaccination BCRs, we evaluated eOD-GT8 binding to inferred-germline
antibodies (iGLs) for 118 VRC01-class and 55 non-VRC01-class
postvaccination BCRs. For comparison we also measured eOD-GT8 binding to
prevaccination IgG memory BCRs (6 of 7 VRC01-class BCRs from both dose
groups and 53 of 55 non-VRC01-class BCRs from the low dose group) and human
naive VRC01-class (n = 62) and non-VRC01-class (n = 72) precursors specific
for the CD4bs of eOD-GT8 and isolated previously from 12 other
HIV-unexposed individuals (21, 43) (Fig. 7 and data S2).

Genes encoding the antibody Fv regions 
were synthesized by GenScript and cloned 
into antibody expression vectors pCW-CHIg-hG1, pCW-CLIg-hL2, and
pCW-CLIg-hk. Monoclonal antibodies were produced by GenScript
using transient transfection in the Expi293F 
system (ThermoFisher) and were purified using 
HiTrap MabSelect SuRe columns (Cytiva). Ad- 
ditional antibody production was conducted 
in house using transient transfection of HEK- 
293F cells (ThermoFisher) and purification 
using rProtein A Sepharose Fast Flow resin 
(Cytiva). We note that the antibodies were 
produced as IgG1, but our B cell sorting and 
BCR sequencing workflow did not distinguish 
between IgG subclasses. 


Antigen production for SPR 


His-tagged monomeric and trimeric antigens 
were produced by transient transfection of 
HEK-293F cells (ThermoFisher) and purified 
by immobilized metal ion affinity chromatography (IMAC) using HisTrap excel
columns (Cytiva) followed by size-exclusion chromatography (SEC) using
either Superdex 75 10/300 GL or Superdex 200 Increase 10/300 GL columns
(Cytiva).


SPR 


We measured the kinetics and affinities of 
antibody-antigen interactions on a Carterra 
LSA instrument using HC30M or CMDP sensor 
chips (Carterra) and 1x HBS-EP+ pH 7.4 run- 
ning buffer (20x stock from Teknova, cat. no. 
H8022) supplemented with BSA at 1 mg/ml. We followed Carterra software
instructions to prepare chip surfaces for ligand capture. In a typical
experiment, approximately 2500 to 2700 RU of capture antibody
(SouthernBiotech catalog no. 2047-01) in 10 mM sodium acetate pH 4.5 was
amine coupled. The critical detail here was the concentration range of the
amine coupling reagents and capture antibody. We used N-hydroxysuccinimide
(NHS) and 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide hydrochloride
(EDC) from the Amine Coupling Kit (GE order code BR-1000-50). As per kit
instruction 22-0510-62 AG, the NHS and EDC should be reconstituted in 10 ml
of water each to give 11.5 mg/ml and 75 mg/ml, respectively. However, the
highest coupling levels of capture antibody were achieved by using 10-fold
diluted NHS and EDC during surface preparation runs. Thus, in our runs the
concentrations of NHS and EDC were 1.15 mg/ml and 7.5 mg/ml. The
concentrated stocks of NHS and EDC could be stored frozen at −20°C for up
to 2 months without noticeable loss of activity. The SouthernBiotech
capture antibody was buffer exchanged into 10 mM sodium acetate pH 4.5
using Zeba spin desalting columns 7K MWCO 0.5 ml (catalog no. 89883 from
Thermo) and was used at a concentration of 0.25 mg/ml with 20 min contact
time. Phosphoric acid 1.7% was our regeneration solution, with 60 s contact
time, injected three times per cycle. The solution concentration of ligands
was around 5 µg/ml and contact time was 3 min. Raw sensorgrams were
analyzed using Kinetics software (Carterra) with interspot and blank double
referencing and a Langmuir model. Analyte concentrations were quantified on
a NanoDrop 2000c Spectrophotometer using the absorption signal at 280 nm.
A typical SPR run tested six different analyte concentrations using a
dilution factor of four. Maximum analyte concentration for eOD-GT8 was
generally 10 µM, except for weak binders, which were generally rerun at
higher maximum analyte concentrations of 50 or 118 µM. For eOD-GT6 and
eOD-GT6 variant analytes, maximum analyte concentrations were 9 to 16 µM,
but weak or expected weak binders were run at 37 to 63 µM. For MD39 trimer,
maximum analyte concentration was 4 or 11 µM, but most interactions were
tested at 11 µM. For core-gp120, maximum analyte concentrations were 11 or
46 µM. For best results, analyte samples were buffer exchanged into the
running buffer using desalting columns or dialysis. We typically covered a
broad range of affinities in our runs, and the best referencing practices
were different depending on how fast the off-rate was for each particular
ligand.
For fast off-rates (>9 x 10° 1/s 1 x 10°? 1/s) we 
used automated batch referencing that in- 
cluded overlay y-align and higher analyte con- 
centrations. For slow off-rates (< 9 x 10°? 1/s), 
we used manual process referencing that included serial y-align and lower
analyte concentrations. After automated data analysis by
the Carterra Kinetics software, we also per- 
formed additional filtering to remove datasets 
with highest response signals smaller than 
signals from negative controls. This additional 
filtering was performed automatically using 
an R-script. The script also identified ligands 
for which capture on the sensor chip was in- 
sufficient, and measurements involving those 
ligands were excluded from subsequent analy- 
ses. Many interactions were measured more 
than once. Whether or not multiple measurements were available, we used the
following algorithm to select the representative measurement for any
particular interaction:

1. If no measurement resulted in a kinetic-fit KD from the Carterra
Kinetics software analysis, the KD value was set to ≥100 µM.

2. If all available measurements had kinetic-fit KDs from the Carterra
Kinetics software analysis that were ≥5 times the maximum analyte
concentration used in the measurement, the KD value was set to ≥100 µM. The
lowest maximum analyte concentration for which this was invoked was 37 µM,
in which case the kinetic-fit KD of >185 µM was reported as >100 µM.

3. If at least one measurement had a kinetic-fit KD from the Carterra
Kinetics software analysis, and if kon and koff were within range for
the instrument (10 min M™ < k,< 1 x 10° min 
M?*;1x10° min! < kog <5 x 107 min”), the 
measurement with the lowest chi-squared 
fit value was chosen as the representative 
measure. 

4. If all available measurements with a kinetic-fit KD had a koff that was
out of range (this only occurred with fast koff), then if the kinetic-fit
KD was within a factor of three of the equilibrium-fit KD, the ratio of
kinetic KD to equilibrium KD was calculated, the measurement with the ratio
closest to one was chosen as the representative, and the equilibrium KD was
reported for that measurement. If no kinetic-fit KD was within a factor of
three of the equilibrium-fit KD, the KD was reported as ≥100 µM. The
reporting of equilibrium-fit KDs was not common: in Fig. 7, n = 2 of 267
postvaccination VRC01-class KD values and n = 10 of 145 postvaccination
non-VRC01-class KD values were reported as the equilibrium-fit KD.

5. We had no cases of kon out of range; thus, the algorithm did not have to
deal with that case.
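The selection steps above can be read as a small decision procedure. The sketch below is one possible reading, not the R script used in the study. All names (`Measurement`, `representative_kd`) are hypothetical. The instrument ranges for kon and koff are passed in by the caller because their numerical values are not reproduced here; the numbers used in the example are placeholders only. "Closest to one" is interpreted on a log scale, and step 2 is applied to the measurements that have a kinetic fit; both are assumptions.

```python
from dataclasses import dataclass
from math import log
from typing import List, Optional, Tuple

CAP_UM = 100.0  # value reported (as >=100 uM) when no reliable KD is available


@dataclass
class Measurement:
    max_conc: float                          # maximum analyte concentration (uM)
    kinetic_kd: Optional[float] = None       # kinetic-fit KD (uM); None if no fit
    equilibrium_kd: Optional[float] = None   # equilibrium-fit KD (uM)
    kon: Optional[float] = None
    koff: Optional[float] = None
    chi2: Optional[float] = None             # chi-squared of the kinetic fit


def in_range(value: Optional[float], bounds: Tuple[float, float]) -> bool:
    return value is not None and bounds[0] < value < bounds[1]


def representative_kd(measurements: List[Measurement],
                      kon_range: Tuple[float, float],
                      koff_range: Tuple[float, float]) -> Tuple[float, str]:
    """Return (KD in uM, kind), where kind is 'kinetic', 'equilibrium', or 'capped'."""
    fitted = [m for m in measurements if m.kinetic_kd is not None]
    # Step 1: no kinetic fit at all.
    if not fitted:
        return CAP_UM, "capped"
    # Step 2: every kinetic fit is at least 5x the top analyte concentration.
    if all(m.kinetic_kd >= 5 * m.max_conc for m in fitted):
        return CAP_UM, "capped"
    # Step 3: among fits with in-range rate constants, take the lowest chi-squared.
    usable = [m for m in fitted
              if in_range(m.kon, kon_range) and in_range(m.koff, koff_range)]
    if usable:
        best = min(usable, key=lambda m: m.chi2)
        return best.kinetic_kd, "kinetic"
    # Step 4: rate constants out of range; fall back to the equilibrium fit if the
    # kinetic and equilibrium KDs agree within a factor of three.
    close = [m for m in fitted
             if m.equilibrium_kd and 1 / 3 <= m.kinetic_kd / m.equilibrium_kd <= 3]
    if close:
        best = min(close, key=lambda m: abs(log(m.kinetic_kd / m.equilibrium_kd)))
        return best.equilibrium_kd, "equilibrium"
    return CAP_UM, "capped"


# Placeholder ranges for illustration only (not the instrument's actual limits).
KON_RANGE = (1e2, 1e8)
KOFF_RANGE = (1e-5, 1.0)
```

Returning the kind of value alongside the KD makes it easy to count how many reported values came from equilibrium fits, as is done for Fig. 7 in item 4.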


SPR reproducibility 


The majority of the iGL and postvaccination 
interactions reported in Fig. 7 were also mea- 
sured using a different instrument (Biacore 8k) 
and different analyte preparations and often 
different ligand preparations. The KDs measured by the 8k were highly
correlated with those measured by the Carterra LSA, and the conclusions
from the two datasets were the same. Also, for the naive precursor KDs
measured using the Carterra and reported in Fig. 7, we previously measured
and published the same interactions using a different instrument (Proteon
XPR) with different analyte and ligand preparations (21, 43); the results
of the new and previous analyses were consistent.


Regression analyses involving 
SPR-measured quantities 


Linear mixed effects models (LMEs) were used to analyze the linear
dependence of one variable on another, for the following dependencies:
(i) dependence of KD for eOD-GT6 or its variants on KD for eOD-GT8, for
VRC01-class BCRs (Fig. 8B); (ii) dependence of iGL KD for eOD-GT8 on the
number of mutations in the BCRs from which the iGL was inferred, for
VRC01-class and non-VRC01-class BCRs (fig. S30B); (iii) dependence of
eOD-GT8 KD on kon and koff, for VRC01-class and non-VRC01-class BCRs
(fig. S32C); (iv) dependence of eOD-GT8 KD, kon, and koff on the number of
BCR V gene mutations (figs. S34 to S36). LMEs were used to estimate the
slope of each association [e.g., change in log(KD) per mutation]. Estimated
associations only included antibodies for which the KD value was measurable
(KD < 100 µM). Each LME had fixed effects for the intercept and slope.
Random effects for the intercept and slope were included for each
participant, with the exception of estimation for a given week (figs. S34B,
S35B, and S36B) or for non-VRC01-class iGL (fig. S30B), where a random
effect for only the intercept was included. For non-VRC01-class BCRs at
week 16 in figs. S34B, S35B, and S36B, only one antibody per participant
was available, so estimates were based on a linear model (KD, kon, and
koff). The estimated regression line and shaded 95% prediction interval
were generated using a semiparametric bootstrap method (using 1000
bootstrap datasets). LMEs were fit and P values testing the null hypothesis
that the fixed effect for slope is zero were evaluated using Satterthwaite's
degrees of freedom method using the lmerTest package in R (93). For a
log10-transformed y-value depending linearly on x, as in figs. S34 to S36,
the effect of x on y is given by $\log(y_2) - \log(y_1) = (x_2 - x_1)\,s$,
where s is the slope. This is equivalent to $y_2/y_1 = 10^{s(x_2 - x_1)}$.
Thus, in figs. S34 to S36, where x represents mutations and y represents
KD, kon, or koff (analyzed on the log10 scale), an increase of n mutations
going from $x_1$ to $x_2$ will lead to a change in y as
$y_2/y_1 = 10^{ns}$. For a log10-transformed y-value depending linearly on
a log10-transformed x-value, as in Fig. 8B, the effect of x on y is given
by $\log(y_2) - \log(y_1) = [\log(x_2) - \log(x_1)]\,s$, where s is the
slope. This is equivalent to $y_2/y_1 = (x_2/x_1)^{s}$. Thus, in Fig. 8B,
where x represents the KD for eOD-GT6 or a GT6 variant, and y represents KD
for eOD-GT8, an increase in the KD for eOD-GT6 or a GT6 variant by a factor
of α will lead to a change in the KD for eOD-GT8 according to
$y_2/y_1 = \alpha^{s}$.
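To make the two slope interpretations concrete, the following sketch converts a fitted slope into a fold change in y. The function names and the numerical slopes are made up for illustration and are not results from the study.

```python
def fold_change_semilog(slope: float, delta_x: float) -> float:
    """Fold change in y when log10(y) is linear in x: y2/y1 = 10**(s * (x2 - x1))."""
    return 10 ** (slope * delta_x)


def fold_change_loglog(slope: float, x_ratio: float) -> float:
    """Fold change in y when log10(y) is linear in log10(x): y2/y1 = (x2/x1)**s."""
    return x_ratio ** slope


# Hypothetical slope of -0.1 log10 units per mutation: ten additional
# mutations correspond to a 10-fold decrease in y.
ten_mutations = fold_change_semilog(-0.1, 10)

# Hypothetical log-log slope of 0.5: a 100-fold increase in x corresponds
# to a 10-fold increase in y.
hundred_fold = fold_change_loglog(0.5, 100)
```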


Statistical analysis 


Placebos in both groups were pooled into a 
single control group for all statistical analy- 
ses. Confidence intervals for frequencies and 
response rates were based on the Wilson score 
method (90). Spearman correlations were used 
to evaluate the association between immune 
responses. To assess if response rates differed 
by group, Barnard’s exact test was used. To 
assess if response rates differed over time 
within a group, McNemar’s test was used to 
account for paired data. Response magnitudes were compared between groups
using the Wilcoxon rank sum test. Response
magnitude comparisons between time points 
were performed using the Wilcoxon signed- 
rank test to account for paired data. Com- 
parisons of percent mutation levels over time 
(Fig. 5, fig. S20, and tables S46 to S49) were 
performed using the Wilcoxon signed-rank test 
to account for paired data. Regression analyses 
were performed as described in Regression 
analyses involving SPR-measured quantities. 
Statistical quantile analyses were carried out 
using R and Pandas (https://github.com/pandas-dev/pandas). The R language
and tidyverse R
packages were used for graphical and statis- 
tical analysis (94, 95). 
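For readers unfamiliar with the Wilson score method used for the confidence intervals above, a minimal Python sketch of the standard formula follows. It is not the study's code (which used R and a customized Seaborn fork); the function name and the default two-sided 95% z-value are choices made here.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.959963984540054):
    """Wilson score confidence interval for a binomial proportion.

    z is the standard normal quantile; the default gives a two-sided 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical example: 18 responders among 24 participants.
low, high = wilson_interval(18, 24)
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and remains informative when the observed rate is 0% or 100%, which matters for small groups.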


Figure generation 


Most figures were generated with either 
Matplotlib (96) or a custom port of the Seaborn 
package that incorporates Wilson confidence 
intervals into the statistical analysis [(97);
https://github.com/jwillis0720/seaborn-fork], and with Adobe Illustrator
for final composition. Figures S4 and S40 were produced using the ggplot
package in R. Figure generation code can be found in the accompanying data
analysis repository (https://github.com/SchiefLab/G001#figures). Figures
S5, S6, S8, and S9 were produced using Intaglio
(https://intaglio.en.softonic.com/mac). Fig. S10 was produced
using BioRender (BioRender.com), and the 
publishing license can be found in the data 
repository. 


REFERENCES AND NOTES 


1. A. S. Fauci, An HIV vaccine is essential for ending the HIV/AIDS pandemic. JAMA 318, 1535-1536 (2017). doi: 10.1001/jama.2017.13505; pmid: 29052689

2. A. Pegu et al., A meta-analysis of passive immunization studies shows that serum-neutralizing antibody titer associates with protection against SHIV challenge. Cell Host Microbe 26, 336-346.e3 (2019). doi: 10.1016/j.chom.2019.08.014; pmid: 31513771

3. L. Corey et al., Two randomized trials of neutralizing antibodies to prevent HIV-1 acquisition. N. Engl. J. Med. 384, 1003-1014 (2021). doi: 10.1056/NEJMoa2031738; pmid: 33730454

4. P. B. Gilbert et al., Neutralization titer biomarker for antibody-mediated prevention of HIV-1 acquisition. Nat. Med. 28, 1924-1932 (2022). doi: 10.1038/s41591-022-01953-6; pmid: 35995954


Leggat et al., Science 378, eadd6502 (2022) 


10. 


i. 


12. 


19. 


20. 


2. 


22. 


23. 


24. 


25. 


26. 


X. Xiao, W. Chen, Y. Feng, D. S. Dimitrov, Maturation pathways 
of cross-reactive HIV-1 neutralizing antibodies. Viruses 1, 
802-817 (2009). doi: 10.3390/v1030802; pmid: 21994570 


X. Xiao et al., Germline-like predecessors of broadly 
neutralizing antibodies lack measurable binding to HIV- 


1 envelope glycoproteins: Implications for evasion of immune 
responses and design of vaccine immunogens. Biochem. 
Biophys. Res. Commun. 390, 404-409 (2009). doi: 10.1016/ 


j.bbre.2009.09.029; pmid: 19748484 
D. S. Dimitrov, Therapeutic antibodies, vaccines and 
antibodyomes. MAbs 2, 347-356 (20 
mabs.2.3.11779; pmid: 20400863 

M. Pancera et al., Crystal structure o 


0). doi: 10.4161/ 


PG16 and chimeric 


dissection with somatically related PG9: Structure-function 
analysis of two quaternary-specific antibodies that effectively 


neutralize HIV-1. J. Virol. 84, 8098-8 
JVI.00966-10; pmid: 20538861 


10 (2010). doi: 10.1128/ 


T. Zhou et al., Structural basis for broad and potent 
neutralization of HIV-1 by antibody VRCOL. Science 329, 


811-817 (2010). doi: 10.1126/science. 


192819; pmid: 20616231 


M. Bonsignori et al., Analysis of a clonal lineage of HIV-1 
envelope V2/V3 conformational epitope-specific broadly 
neutralizing antibodies and their inferred unmutated common 
ancestors. J. Virol. 85, 9998-10009 (2011). doi: 10.1128/ 


JVI.05045-11; pmid: 21795340 


J. F. Scheid et al., Sequence and structural convergence of 
broad and potent HIV antibodies that mimic CD4 binding. 


Science 333, 1633-1637 (20 


pmid: 21764753 


1). doi: 10.1126/science.1207227; 


S. Hoot et al., Recombinant HIV envelope proteins fail to 


engage germline versions of 
9, e1003106 (2013). doi: 10. 


pmid: 23300456 


J. Jardine et al., Rational HIV 
specific germline B cell recep’ 
(2013). doi: 10.1126/science. 


anti-CD4bs bNAbs. PLOS Pathog. 
371/journal.ppat.1003106; 


immunogen design to target 
tors. Science 340, 711-716 
234150; pmid: 23539181 


A. T. McGuire et al., Engineering HIV envelope protein to 


activate germline B cell rece! 


ptors of broadly neutralizing 


anti-CD4 binding site antibodies. J. Exp. Med. 210, 655-663 


(2013). doi: 10.1084/jem.20 


22824; pmid: 23530120 


B. F. Haynes, G. Kelsoe, S. C. Harrison, T. B. Kepler, 
B-cell-lineage immunogen design in vaccine development with 
HIV-1 as a case study. Nat. Biotechnol. 30, 423-433 (2012). 
doi: 10.1038/nbt.2197; pmid: 22565972 


. H. X. Liao et al., Co-evolution of a broadly neutralizing 


HIV-1 antibody and founder virus. Nature 496, 469-476 
(2013). doi: 10.1038/nature12053; pmid: 23552890 

F. Gao et al., Cooperation of B cell lineages in induction of 
HIV-1-broadly neutralizing antibodies. Cel! 158, 481-491 
(2014). doi: 10.1016/j.cell.2014.06.022; pmid: 25065977 


. M. Bonsignori et al., Maturation pathway from germline to 


broad HIV-1 neutralizer of a CD4-mimic antibody. Cel! 165, 


449-463 (2016). doi: 
pmid: 26949186 


0.1016/j.cell.2016.02.022; 


M. Bonsignori et al., Staged induction of HIV-1 glycan-dependent
broadly neutralizing antibodies. Sci. Transl. Med. 9, eaai7514
(2017). doi: 10.1126/scitranslmed.aai7514; pmid: 28298420

K. O. Saunders et al., Targeted selection of HIV-specific
antibody mutations by engineering B cell maturation. Science
366, eaay7199 (2019). doi: 10.1126/science.aay7199;
pmid: 31806786

J. G. Jardine et al., HIV-1 broadly neutralizing antibody precursor
B cells revealed by germline-targeting immunogen. Science 351,
1458-1463 (2016). doi: 10.1126/science.aad9195; pmid: 27013733

J. M. Steichen et al., HIV vaccine design to target germline
precursors of glycan-dependent broadly neutralizing
antibodies. Immunity 45, 483-496 (2016). doi: 10.1016/
j.immuni.2016.08.016; pmid: 27617678

A. Escolano et al., Sequential immunization elicits broadly
neutralizing anti-HIV-1 antibodies in Ig knockin mice. Cell 166,
1445-1458.e12 (2016). doi: 10.1016/j.cell.2016.07.030;
pmid: 27610569


M. Medina-Ramirez et al., Design and crystal structure of a 
native-like HIV-1 envelope trimer that engages multiple broadly 
neutralizing antibody precursors in vivo. J. Exp. Med. 214, 
2573-2590 (2017). doi: 10.1084/jem.20161160; 


pmid: 28847869 


J. M. Steichen et al., A generalized HIV vaccine design strategy 
for priming of broadly neutralizing antibody responses. Science 
366, eaax4380 (2019). doi: 10.1126/science.aax4380; 


pmid: 31672916 


B. Briney et al., Tailored immunogens direct affinity 
maturation toward HIV neutralizing antibodies. Cell 166, 


2 December 2022 




1459-1470.e11 (2016). doi: 10.1016/j.cell.2016.08.005;
pmid: 27610570

M. Tian et al., Induction of HIV neutralizing antibody lineages in
mice with diverse precursor repertoires. Cell 166, 1471-1484.e18
(2016). doi: 10.1016/j.cell.2016.07.029; pmid: 27610571

K. R. Parks et al., Overcoming steric restrictions of VRC01
HIV-1 neutralizing antibodies through immunization. Cell Rep.
29, 3060-3072.e7 (2019). doi: 10.1016/j.celrep.2019.10.071;
pmid: 31801073

X. Chen et al., Vaccination induces maturation in a mouse
model of diverse unmutated VRC01-class precursors to
HIV-neutralizing antibodies with >50% breadth. Immunity 54,
324-339.e8 (2021). doi: 10.1016/j.immuni.2020.12.014;
pmid: 33453152


L. Verkoczy et al., Induction of HIV-1 broad neutralizing 
antibodies in 2F5 knock-in mice: Selection against membrane 
proximal external region-associated autoreactivity limits 
T-dependent responses. J. Immunol. 191, 2538-2550 (2013). 


doi: 10.4049/jimmunol.1300971; pmid: 23918977 


T. Bradley et al., HIV-1 envelope mimicry of host enzyme
kynureninase does not disrupt tryptophan metabolism.
J. Immunol. 197, 4663-4673 (2016). doi: 10.4049/
jimmunol.1601484; pmid: 27849170

R. Zhang et al., Initiation of immune tolerance-controlled HIV
gp41 neutralizing B cell lineages. Sci. Transl. Med. 8, 336ra62
(2016). doi: 10.1126/scitranslmed.aaf0618; pmid: 27122615

S. M. Alam et al., Mimicry of an HIV broadly neutralizing
antibody epitope with a synthetic glycopeptide. Sci. Transl.
Med. 9, eaai7521 (2017). doi: 10.1126/scitranslmed.aai7521;
pmid: 28298421


T. Zhou et al., Quantification of the impact of the HIV-1-glycan 
shield on antibody elicitation. Cell Rep. 19, 719-732 (2017). 
doi: 10.1016/j.celrep.2017.04.013; pmid: 28445724 

V. Dubrovskaya et al., Targeted N-glycan deletion at the 
receptor-binding site retains HIV Env NFL trimer integrity and 
accelerates the elicited antibody response. PLOS Pathog. 13, 
e1006614 (2017). doi: 10.1371/journal.ppat.1006614; 


pmid: 28902916 


K. Xu et al., Epitope-based vaccine design yields fusion
peptide-directed antibodies that neutralize diverse strains of
HIV-1. Nat. Med. 24, 857-867 (2018). doi: 10.1038/
s41591-018-0042-6; pmid: 29867235

D. Fera et al., HIV envelope V3 region mimic embodies key
features of a broadly neutralizing antibody lineage epitope. Nat.
Commun. 9, 1111 (2018). doi: 10.1038/s41467-018-03565-6;
pmid: 29549260

R. Kong et al., Antibody lineages with vaccine-induced antigen-binding
hotspots develop broad HIV neutralization. Cell 178,
567-584.e19 (2019). doi: 10.1016/j.cell.2019.06.030;
pmid: 31348886

J. G. Jardine et al., Priming a broadly neutralizing antibody
response to HIV-1 using a germline-targeting immunogen.
Science 349, 156-161 (2015). doi: 10.1126/science.aac5894;
pmid: 26089355


V. K. Sharma et al., Use of transient transfection for cGMP 
manufacturing of eOD-GT8 60mer, a self-assembling 
nanoparticle germline-targeting HIV-1 vaccine candidate. 
bioRxiv 2022.09.30.510310 [Preprint] (2022); https://doi.org/ 


10.1101/2022.09.30.510310. 


A. P. West Jr., R. Diskin, M. C. Nussenzweig, P. J. Bjorkman, 
Structural basis for germ-line gene usage of a potent class of 
antibodies targeting the CD4-binding site of HIV-1 gp120. 
Proc. Natl. Acad. Sci. U.S.A. 109, E2083-E2090 (2012). 


doi: 10.1073/pnas.1208984109; pmid: 22745174 


T. Zhou et al., Multidonor analysis reveals structural elements,
genetic determinants, and maturation pathway for HIV-1
neutralization by VRC01-class antibodies. Immunity 39, 245-258
(2013). doi: 10.1016/j.immuni.2013.04.012; pmid: 23911655

C. Havenar-Daughton et al., The human naive B cell repertoire
contains distinct subclasses for a germline-targeting
HIV-1 vaccine immunogen. Sci. Transl. Med. 10, eaat0381
(2018). doi: 10.1126/scitranslmed.aat0381; pmid: 29973404

J. H. Lee et al., Vaccine genetics of IGHV1-2 VRC01-class
broadly neutralizing antibody precursor naive human B cells.
NPJ Vaccines 6, 113 (2021). doi: 10.1038/s41541-021-00376-7;
pmid: 34489473

P. Dosenovic et al., Immunization for HIV-1 broadly neutralizing
antibodies in human Ig knockin mice. Cell 161, 1505-1515
(2015). doi: 10.1016/j.cell.2015.06.003; pmid: 26091035

D. Sok et al., Priming HIV-1 broadly neutralizing antibody
precursors in human Ig loci transgenic mice. Science 353,
1557-1560 (2016). doi: 10.1126/science.aah3945;
pmid: 27608668




Leggat et al., Science 378, eadd6502 (2022) 


R. K. Abbott et al., Precursor frequency and affinity determine
B cell competitive fitness in germinal centers, tested with 
germline-targeting HIV vaccine immunogens. Immunity 48, 
133-146.e6 (2018). doi: 10.1016/j.immuni.2017.11.023; 

pmid: 29287996 
D. Huang et al., B cells expressing authentic naive human 
VRC01-class BCRs can be recruited to germinal centers and
affinity mature in multiple independent mouse models. Proc. 
Natl. Acad. Sci. U.S.A. 117, 22920-22931 (2020). doi: 10.1073/ 
pnas.2004489117; pmid: 32873644 

X. Wang et al., Multiplexed CRISPR/CAS9-mediated 
engineering of pre-clinical mouse models bearing native human 
B cell receptors. EMBO J. 40, e105926 (2021). doi: 10.15252/ 
embj.2020105926; pmid: 33258500 

C. Havenar-Daughton et al., Rapid germinal center and 
antibody responses in non-human primates after a single 
nanoparticle vaccine immunization. Cell Rep. 29, 1756-1766.e8 
(2019). doi: 10.1016/j.celrep.2019.10.008; pmid: 31722194 

V. Vigdorovich et al., Repertoire comparison of the B-cell 
receptor-encoding loci in humans and rhesus macaques by 
next-generation sequencing. Clin. Transl. Immunology 5, e93 
(2016). doi: 10.1038/cti.2016.42; pmid: 27525066 

N. Vazquez Bernat et al., Rhesus and cynomolgus macaque 
immunoglobulin heavy-chain genotyping yields comprehensive 
databases of germline VDJ alleles. Immunity 54, 355-366.e4 
(2021). doi: 10.1016/j.immuni.2020.12.018; pmid: 33484642 
C. Hervé, B. Laupèze, G. Del Giudice, A. M. Didierlaurent,
F. Tavares Da Silva, The how's and what's of vaccine
reactogenicity. NPJ Vaccines 4, 39 (2019). doi: 10.1038/
s41541-019-0132-6; pmid: 31583123

H. Duan et al., Glycan masking focuses immune responses to 
the HIV-1 CD4-binding site and enhances elicitation of 
VRC01-class precursor antibodies. Immunity 49, 301-311.e5
(2018). doi: 10.1016/j.immuni.2018.07.005; pmid: 30076101 
J. Umotoy et al., Rapid and focused maturation of a 
VRC01-class HIV broadly neutralizing antibody lineage
involves both binding and accommodation of the N276-glycan. 
Immunity 51, 141-154.e6 (2019). doi: 10.1016/ 
j.immuni.2019.06.004; pmid: 31315032 

J. G. Jardine et al., Minimally mutated HIV-1 broadly 
neutralizing antibodies to guide reductionist vaccine design. 
PLOS Pathog. 12, e1005815 (2016). doi: 10.1371/journal. 
ppat.1005815; pmid: 27560183 

B. A. Heesters, C. E. van der Poel, A. Das, M. C. Carroll, Antigen 
presentation to B cells. Trends Immunol. 37, 844-854 (2016). 
doi: 10.1016/j.it.2016.10.003; pmid: 27793570 

Y. Kato et al., Multifaceted effects of antigen valency on B cell 
response composition and differentiation in vivo. Immunity 53, 
548-563.e8 (2020). doi: 10.1016/j.immuni.2020.08.001; 
pmid: 32857950 
P. Dosenovic et al., Anti-HIV-1 B cell responses are dependent 
on B cell precursor frequency and antigen-binding affinity. 
Proc. Natl. Acad. Sci. U.S.A. 115, 4743-4748 (2018). 

doi: 10.1073/pnas.1803457115; pmid: 29666227 

L. Mesin et al., Restricted clonality and limited germinal center 
reentry characterize memory B cell reactivation by boosting. 
Cell 180, 92-106.e11 (2020). doi: 10.1016/j.cell.2019.11.032; 
pmid: 31866068 
C. Viant et al., Antibody affinity shapes the choice between
memory and germinal center B cell fates. Cell 183, 1298-1311.e11
(2020). doi: 10.1016/j.cell.2020.09.063; pmid: 33125897
K. A. Pape, J. J. Taylor, R. W. Maul, P. J. Gearhart, M. K. Jenkins, 
Different B cell populations mediate early and late memory 
during an endogenous immune response. Science 331, 
1203-1207 (2011). doi: 10.1126/science.1201730; 

pmid: 21310965 
G. V. Zuccarino-Catania et al., CD80 and PD-L2 define 
functionally distinct memory B cell subsets that are 
independent of antibody isotype. Nat. Immunol. 15, 631-637 
(2014). doi: 10.1038/ni.2914; pmid: 24880458 

G. E. Phad et al., Extensive dissemination and intraclonal 
maturation of HIV Env vaccine-induced B cell responses. 

J. Exp. Med. 217, e20191155 (2020). doi: 10.1084/jem.20191155; 
pmid: 31704807 
A. Cho et al., Anti-SARS-CoV-2 receptor-binding domain antibody 
evolution after mRNA vaccination. Nature 600, 517-522 (2021). 
doi: 10.1038/s41586-021-04060-7; pmid: 34619745 

F. Muecksch et al., Increased memory B cell potency and 
breadth after a SARS-CoV-2 mRNA boost. Nature 607, 128-134 
(2022). doi: 10.1038/s41586-022-04778-y; pmid: 35447027 

G. D. Tomaras et al., Initial B-cell responses to transmitted 
human immunodeficiency virus type 1: Virion-binding 
immunoglobulin M (IgM) and IgG antibodies followed by 
plasma anti-gp41 antibodies with ineffective control of initial 




viremia. J. Virol. 82, 12449-12463 (2008). doi: 10.1128/ 
JVI.01708-08; pmid: 18842730

N. L. Yates et al., Vaccine-induced Env V1-V2 IgG3 correlates 
with lower HIV-1 infection risk and declines soon after 
vaccination. Sci. Transl. Med. 6, 228ra39 (2014). doi: 10.1126/ 
scitranslmed.3007730; pmid: 24648342

N. L. Yates et al., HIV-1 envelope glycoproteins from diverse 
clades differentiate antibody responses and durability among 
vaccinees. J. Virol. 92, e01843-17 (2018). doi: 10.1128/
JVI.01843-17; pmid: 29386288 

D. C. Montefiori, Measuring HIV neutralization in a luciferase 
reporter gene assay. Methods Mol. Biol. 485, 395-405 (2009). 
doi: 10.1007/978-1-59745-170-3_26; pmid: 19020839 

M. Sarzotti-Kelsoe et al., Optimization and validation of the 
TZM-bl assay for standardized assessments of neutralizing 
antibodies against HIV-1. J. Immunol. Methods 409, 131-146 
(2014). doi: 10.1016/j.jim.2013.11.022; pmid: 24291345 

C. C. LaBranche et al., HIV-1 envelope glycan modifications that 
permit neutralization by germline-reverted VRC01-class broadly
neutralizing antibodies. PLOS Pathog. 14, e1007431 (2018). 

doi: 10.1371/journal.ppat.1007431; pmid: 30395637 

C. C. LaBranche et al., Neutralization-guided design of 

HIV-1 envelope trimers with high affinity for the unmutated 
common ancestor of CH235 lineage CD4bs broadly 
neutralizing antibodies. PLOS Pathog. 15, e1008026 (2019). 
doi: 10.1371/journal.ppat.1008026; pmid: 31527908 

G. Finak, W. Jiang, R. Gottardo, CytoML for cross-platform 
cytometry data sharing. Cytometry A 93, 1189-1196 (2018). 
doi: 10.1002/cyto.a.23663; pmid: 30551257 

G. Finak, M. Jiang, flowWorkspace: Infrastructure for 
representing and interacting with gated and ungated 
cytometry data sets. R package version 4.5.3. 2011. 

P. Van, W. Jiang, R. Gottardo, G. Finak, ggCyto: Next 
generation open-source visualization software for cytometry. 
Bioinformatics 34, 3951-3953 (2018). doi: 10.1093/ 
bioinformatics/bty441; pmid: 29868771

K. L. Boswell et al., Application of B cell immortalization for the 
isolation of antibodies and B cell clones from vaccine and 
infection settings. bioRxiv 2022.2003.2029.485179 [Preprint] 
(2022); https://doi.org/10.1101/2022.03.29.485179. 

A. A. Upadhyay et al., BALDR: A computational pipeline for
paired heavy and light chain immunoglobulin reconstruction in 
single-cell RNA-seq data. Genome Med. 10, 20 (2018). 

doi: 10.1186/s13073-018-0528-3; pmid: 29558968 

B. Briney, D. R. Burton, Massively scalable genetic analysis of 
antibody repertoires. bioRxiv 447813 [Preprint] (2018); 
https://doi.org/10.1101/447813. 

F. Breden et al., Reproducibility and reuse of adaptive immune 
receptor repertoire data. Front. Immunol. 8, 1418 (2017). 

doi: 10.3389/fimmu.2017.01418; pmid: 29163494 

B. Ewing, L. Hillier, M. C. Wendl, P. Green, Base-calling of 
automated sequencer traces using phred. I. Accuracy
assessment. Genome Res. 8, 175-185 (1998). doi: 10.1101/ 
gr.8.3.175; pmid: 9521921 

B. Ewing, P. Green, Base-calling of automated sequencer 
traces using phred. II. Error probabilities. Genome Res. 8,
186-194 (1998). doi: 10.1101/gr.8.3.186; pmid: 9521922 

J. Ye, N. Ma, T. L. Madden, J. M. Ostell, IgBLAST: An 
immunoglobulin variable domain sequence analysis tool. 
Nucleic Acids Res. 41, W34-W40 (2013). doi: 10.1093/nar/ 
gkt382; pmid: 23671333 

G. Navarro, A guided tour to approximate string matching. 

ACM Comput. Surv. 33, 31-88 (2001). doi: 10.1145/375360.375365 
J. Dunbar, C. M. Deane, ANARCI: Antigen receptor numbering 
and receptor classification. Bioinformatics 32, 298-300 
(2016). pmid: 26424857 

N. Vazquez Bernat et al., High-quality library preparation for 
NGS-based immunoglobulin germline gene inference and 
repertoire expression analysis. Front. Immunol. 10, 660 (2019). 
doi: 10.3389/fimmu.2019.00660; pmid: 31024532 

M. M. Corcoran et al., Production of individualized V gene 
databases reveals high levels of immunoglobulin genetic 
diversity. Nat. Commun. 7, 13642 (2016). doi: 10.1038/ 
ncomms13642; pmid: 27995928 
F. Pedregosa et al., Scikit-learn: Machine learning in Python. 
J. Mach. Learn. Res. 12, 2825-2830 (2011). 
B. Briney, K. Le, J. Zhu, D. R. Burton, Clonify: Unseeded antibody 
lineage assignment from next-generation sequencing data. Sci. 
Rep. 6, 23901 (2016). doi: 10.1038/srep23901; pmid: 27102563 
A. Agresti, B. A. Coull, Approximate is better than “exact” for 
interval estimation of binomial proportions. Am. Stat. 52, 
119-126 (1998). 

A. Kovaltsuk et al., Observed Antibody Space: A resource for
data mining next-generation sequencing of antibody
repertoires. J. Immunol. 201, 2502-2509 (2018). doi: 10.4049/
jimmunol.1800708; pmid: 30217829

T. H. Olsen, F. Boyles, C. M. Deane, Observed Antibody Space:
A diverse database of cleaned, annotated, and translated
unpaired and paired antibody sequences. Protein Sci. 31,
141-146 (2022). doi: 10.1002/pro.4205; pmid: 34655133

A. Kuznetsova, P. B. Brockhoff, R. H. B. Christensen, lmerTest
package: Tests in linear mixed effects models. J. Stat. Softw.
82, 1-26 (2017). doi: 10.18637/jss.v082.i13

H. Wickham et al., Welcome to the tidyverse. J. Open Source
Softw. 4, 1686 (2019). doi: 10.21105/joss.01686

R Core Team, R: A Language and Environment for Statistical
Computing (2019); https://www.R-project.org/.

J. D. Hunter, Matplotlib: A 2D graphics environment. Comput.
Sci. Eng. 9, 90-95 (2007). doi: 10.1109/MCSE.2007.55

M. L. Waskom, seaborn: Statistical data visualization. J. Open
Source Softw. 6, 3021 (2021). doi: 10.21105/joss.03021

B. J. DeKosky et al., In-depth determination and analysis of the
human paired heavy- and light-chain antibody repertoire.
Nat. Med. 21, 86-91 (2015). doi: 10.1038/nm.3743;
pmid: 25501908

J. Willis, SchiefLab/G001: First release, Zenodo (2022);
https://doi.org/10.5281/zenodo.7334877.


ACKNOWLEDGMENTS


We thank P. Anklesaria, N. Russell, and E. Emini for discussions
and trial planning; and L. Stamatatos, B. Correia, M. Azoitei,
J. Bohl, S. Deeks, and P. Fast for comments on the manuscript.
At IAVI, we thank H. Park, L. Sunner, I. Ayanru, H. Bester,
P. Neuenschwander, D. Todd, K. Rutkowski, R. Edelstein, and
. Crisafi for clinical management; J. Ackland for overseeing the
toxicology study; S. Hingorani, A. Elnatan, and D. Zachariah
for regulatory filings; and V. Tsvetnitsky, E. Sayeed, V. Sharma,
J. Ackland, K. Syvertsen, S. Pallerla, S. Avula, P. Kishineskaya,
R. Platt, N. Williams, R. Colacot, and T. Hassell for manufacturing
oversight and quality assurance. We thank O. Fagbayi and A. Mosley
for project management at the IAVI Neutralizing Antibody Center
(NAC) at Scripps. At Fred Hutchinson Cancer Center (FHCC), we
thank D. Berger, G. Braun, and K. Louis for clinical operations;
A. Varni, T. Haight, C. Marty, and S. Ameny for sample and assay
management; A. Heit for FNA method development; and M. Shen and
J. Hural for laboratory operations. At George Washington University
(GWU), we thank E. Malkin, S. Henn, A. Desrosiers, and S. Walker
for clinical operations; and L. Scholte, L. Schellhaas, and L. Hoeweler
for sample and assay management. Related to BAMA assays at
Duke, we thank T. McNair, J. Choi, and M. Archibald for technical
expertise; S. Sawant for data analysis; M. Sampson and A. Sharak
for data management; J. Lucas for the Good Clinical Laboratory
Practice (GCLP)-compliant laboratory environment; and
M. Sarzotti-Kelsoe for quality assurance oversight. For neutralization
assays at Duke, we thank E. Domin, C. West, and J. Chen.
At the National Institutes of Health (NIH) Vaccine Research
Center (VRC), we thank J. M. Brenchley for flow cytometry
support, and M. Prabhakaran and B. Flach for technical
advice. At IAVI and Scripps, we thank D. Sok, B. Briney, P. Skog,
D. Nemazee, and D. Burton for a preclinical in vivo test of
vaccine and adjuvant. At GlaxoSmithKline Biologicals (GSK), we thank
M. Koutsoukos, F. Roman, O. van der Meeren, C. Lorin, and
C. Laugier for providing AS01B and for carrying out preclinical
compatibility and toxicity studies. We also thank C. Havenar-Daughton
and B. Shakoor for sequences of CLK antibodies, J. H. Lee for
providing human naive BCR sequences from (44), X. Wang, and F. Batista
for providing data from (49). Funding: This work was supported


by the Bill and Melinda Gates Foundation Collaboration for
AIDS Vaccine Discovery (CAVIMC INV-007368 to D.M. and G.D.T.;
CCVIMC INV-007371 to R.A.K., A.B.M., and M.J.M.; VISC INV-008017
and INV-032929 to A.C.d.; VxPDC INV-008352 and INV-007375
to IAVI; and NAC INV-007522 and INV-008813 to W.R.S.);
IAVI (including IAVI 167627819 to M.J.M., IAVI AO8031 research
collaboration agreement with Karolinska Institutet to G.B.K.H.,
and other support to W.R.S.); the IAVI Neutralizing Antibody
Center (NAC) to W.R.S.; National Institute of Allergy and
Infectious Diseases (NIAID) P01 AI094419 (HIVRAD “Optimizing
HIV immunogen-BCR interactions for vaccine development”)
(to W.R.S.); UM1 AI100663 (Scripps Center for HIV/AIDS Vaccine
Immunology and Immunogen Discovery) and UM1 AI144462
(Scripps Consortium for HIV/AIDS Vaccine Development) (to
W.R.S. and M.J.M.); and UM1AI069481 (Seattle-Lausanne CTU),
U19AI128914 (HIPC), and UM1AI068618 (HVTN LC) to M.J.M.
This work was also supported by the Swedish Research Council
(grant 2017-00968 to G.B.K.H.) and by the Ragon Institute of
MGH, MIT, and Harvard (to W.R.S.). Author contributions: K.W.C.,
A.C.d., S.M., D.S.L., J.R.W., W.J.F., M.J.M., A.B.M., and W.R.S.




designed the study. D.J.L., K.W.C., M.J.M., R.A.K., and A.B.M.
supervised B cell sorting, PCR, and sequencing. J.R.W., W.J.F.,
A.C.d., S.M., G.F., and W.R.S. carried out data organization and
BCR sequence analysis. L.B.-F., A.Sr., J.R.P., R.E.W., A.Se., J.Br.,
A.M.R., W.H., and D.R.A. carried out or assisted B cell sorting
and/or PCR. S.M., F.R., A.Lo., and D.S.L. provided trial planning. V.P.
and D.S.L. monitored safety and adherence to protocol. N.L.Y.,
L.D.W., and G.D.T. supervised BAMA studies. K.G., H.G., and D.M.
supervised neutralization assays. M.M.C. and G.B.K.H. provided
VH1-2 genotype analysis. K.W.C., L.B.-F., A.C., and A.B.M.
developed the B cell assay pre-trial. A.T. and D.M.B. assisted
data organization. M.R. contributed gating design. S.C. provided
guidance on FNAs and FACS. J.M., O.Ko., N.K., J.Be., D.D., and
M.J.M. supervised all clinical activities. O.Ka. and A.Li. performed
SPR analysis. C.A.C., T.Sc., X.H., and W.R.S. planned SPR
studies and analyzed results. R.T., E.G., S.E., N.A., D.L., T.-M.M.,
M.K., and B.G. produced proteins and antibodies for SPR.
W.J.F., C.R.M., and A.C.d. performed statistical analyses. W.R.S.




wrote the main text; D.J.L., K.W.C., J.R.W., C.A.C., O.Ka., and 
W.R.S. wrote the supplementary materials; and all authors 
commented. J.R.W., W.J.F., A.C.d., T.Si., K.W.C., D.J.L., and 
W.R.S. created figures and tables. Competing interests: W.R.S. 
and S.M. are inventors on patents filed by Scripps and IAVI 

on the eOD-GT8 monomer and 60mer immunogens. Data and 
materials availability: All BCR sequences and FACS analysis 
files produced in this study are available in the public 

data repository https://github.com/SchiefLab/G001 (99).

The repository contains four separate modules: (i) FACS analysis, 
(ii) BCR sequence analysis, (iii) combined B cell frequency 

and BCR sequencing analysis, and (iv) figure and table generation. 
Instructions for running each module of the repository are provided 
in the README file. Sequences for antibody expression vectors 
were deposited in GenBank under accession numbers ON512569, 
ON5125670, and ON5125671. All other data are available in the 
main text or supplementary materials. License information: 
Copyright © 2022 the authors, some rights reserved; exclusive 




licensee American Association for the Advancement of Science. No 
claim to original US government works. https://www.science.org/ 
about/science-licenses-journal-article-reuse 


SUPPLEMENTARY MATERIALS 


science.org/doi/10.1126/science.add6502 
Supplementary Text 

Figs. S1 to S41 

Tables S1 to S50 

References (100-114) 

MDAR Reproducibility Checklist 

Data S1 and S2 


Submitted 24 June 2022; resubmitted 28 September 2022 
Accepted 27 October 2022 
10.1126/science.add6502 


28 of 28 


RESEARCH 


RESEARCH ARTICLE SUMMARY 


SYNTHETIC BIOLOGY 


Monitoring of cell-cell communication 
and contact history in mammals 


Shaohua Zhang†, Huan Zhao†, Zixin Liu, Kuo Liu, Huan Zhu, Wenjuan Pu, Lingjuan He,
Rong A. Wang, Bin Zhou*


INTRODUCTION: Cell-cell communication through 
direct contact is pervasive in multicellular 
organisms and is essential in many funda- 
mental biological processes. The ability to ex- 
perimentally track such cell-cell communication 
signaling could substantially advance our 
understanding of diverse biological processes 
from embryogenesis to tumorigenesis. The ex- 
isting technologies are not suitable to monitor 
and trace cell-cell contact for long-term in vivo 
studies, because many biological processes 
such as embryogenesis, tumorigenesis, and 
tissue regeneration develop over time after 
the initial cell-cell contact. 


RATIONALE: The Notch pathway transmits sig- 
naling through direct cell-cell contact in many 
cellular processes during development and 
homeostasis. In the canonical Notch pathway, 
upon cell contact, the Notch ligand on one cell 
binds to the Notch receptor on another cell 
to trigger a signaling pathway that leads to 
transcription activation of particular genes. 
To understand the dynamic in vivo cell-cell 
communications over time, we developed an 
intercellular genetic approach using the synthetic
Notch pathway (synNotch) that converts
a cellular contact event into a controllable transcriptional
program. We engineered in mice an
artificial Notch ligand, a membrane-tethered 


green fluorescent protein (GFP), into one cell 
type (the sender cell) and an artificial receptor 
in which the extracellular and intracellular 
domains of Notch were replaced with an anti- 
GFP nanobody and the tetracycline transac- 
tivator, respectively, into another cell type (the 
receiver cell). Contact between the sender and 
receiver cells triggered synNotch signaling that 
activated the downstream transcriptional pro- 
grams in the receiver cell in vivo. To reveal the 
ongoing cell-cell contact, as reflected by synNotch 
activation in a receiver cell after direct contact 
with a sender cell, we used a tet-off system to 
express detectable reporters. To trace cell con- 
tact history, we used the Cre-loxP system to 
genetically fate map receiver cells, along with 
their progenies, permanently after cell contact. 
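
The two readouts described above can be contrasted with a toy model. This is only an illustrative sketch under simplifying assumptions, not the authors' method: the class and field names (`ReceiverCell`, `ongoing_signal`, `history_mark`) are hypothetical, contact is treated as a yes/no state per time step, and reporter protein turnover and doxycycline control are ignored.

```python
from dataclasses import dataclass


@dataclass
class ReceiverCell:
    """Toy receiver cell. All names here are illustrative assumptions."""
    ongoing_signal: bool = False  # tet-off style reporter: on only during contact
    history_mark: bool = False    # Cre-loxP style mark: permanent once set

    def update(self, touching_sender: bool) -> None:
        # synNotch activation requires direct contact with a sender cell.
        self.ongoing_signal = touching_sender
        # Recombination is irreversible, so the mark is never cleared.
        self.history_mark = self.history_mark or touching_sender

    def divide(self) -> "ReceiverCell":
        # Progeny inherit the genetic mark but not the contact state.
        return ReceiverCell(ongoing_signal=False, history_mark=self.history_mark)


def trace(contacts):
    """Return (ongoing, history) readouts for a sequence of contact states."""
    cell = ReceiverCell()
    readouts = []
    for touching in contacts:
        cell.update(touching)
        readouts.append((cell.ongoing_signal, cell.history_mark))
    return readouts
```

In this sketch a cell that touched a sender once and then moved away shows no ongoing signal but keeps its history mark, and so do its daughters, which is the distinction the tet-off and Cre-loxP readouts are meant to capture.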


RESULTS: In the intercellular genetic system, we 
demonstrated that endothelial cells (receiver 
cells) in the developing heart were genetically 
labeled after contact with neighboring cardio- 
myocytes (sender cells). The endothelial cells 
that had contact with cardiomyocytes in early 
embryogenesis were permanently tagged with 
the genetic reporter. Their progenies migrated 
into liver and subsequently formed a substan- 
tial portion of the vasculature there, suggesting 
that part of the liver vasculature originates from 
the developing heart during embryogenesis.
Application of these synNotch mice in tumorigenesis
revealed the contact history between tumor
cells (sender cells) and endothelial cells (receiver
cells) during tumor growth and revealed that 
tumor vessels not only expanded within the tu- 
mor but also outgrew into the periphery of the 
tumor and had strong angiogenic properties. 
Upon contacting tumor cells, these endothelial 
cells gained properties in angiogenic, migratory, 
and inflammatory responses. Additionally, we 
generated mice for Cre-induced synNotch ligand 
or receptor expression, enabling broadly appli- 
cable approaches for genetic labeling of cell- 
cell contact and study of cell contact signaling 
in vivo. Engineering both the synNotch ligand 
and receptor, as well as different genetic readouts, 
in one mouse, we demonstrated simultaneous 
yet distinct recording of not only ongoing cell- 
cell contact but also historical cell-cell contact. 


CONCLUSION: Our work provides a genetic sys- 
tem for recording cell-cell contact and cell 
contact history in vivo. The implications of our 
findings are that endothelial cells in the de- 
veloping heart migrate and contribute to the 
liver vasculature, whereas endothelial cells in 
tumors not only expand within the tumor but 
also grow outward into the boundary-adjacent 
normal tissue with robust angiogenesis. The 
suite of new synNotch mouse lines provides a 
toolbox for genetic labeling and tracing of con- 
tacts between any cell type, offering a useful 
approach for studying dynamic in vivo cell-cell 
communications and the resulting cell fate 
plasticity in diverse life science fields. 
Monitoring of cell-cell contact in the heart. (A) Whole-mount fluorescence image of a mouse embryo showing that cardiomyocytes and endothelial cells express 
the synNotch ligand (green) and the synNotch receptor (purple). (B to D) Whole-mount images of synNotch neonatal hearts shown in green (B), blue (C), and 
red (D) fluorescence channels. Present and past cell contact signaling are displayed by blue and red fluorescence, respectively. 


SCIENCE science.org 


2 DECEMBER 2022 * VOL 378 ISSUE 6623 965 


RESEARCH 


RESEARCH ARTICLE 


SYNTHETIC BIOLOGY 


Monitoring of cell-cell communication 
and contact history in mammals 


Shaohua Zhang†, Huan Zhao†, Zixin Liu, Kuo Liu, Huan Zhu, Wenjuan Pu, Lingjuan He,
Rong A. Wang, Bin Zhou*


Monitoring of cell-cell communication in multicellular organisms is fundamental to understanding diverse 
biological processes such as embryogenesis and tumorigenesis. To track cell-cell contacts in vivo, we 
developed an intercellular genetic technology to monitor cell-cell contact and to trace cell contact 
histories by permanently marking contacts between cells. In mice, we engineered an artificial Notch 
ligand into one cell (the sender cell) and an artificial receptor into another cell (the receiver cell). 
Contact between the sender and receiver cells triggered a synthetic Notch signaling that activated 
downstream transcriptional programs in the receiver cell, thereby transiently or permanently labeling 
it. In vivo cell-cell contact was observed during development, tissue homeostasis, and tumor growth. This 
technology may be useful for studying dynamic in vivo cell-cell contacts and cell fate plasticity. 


Cell-cell communication is ubiquitous in
multicellular organisms and is essential
in fundamental biological processes, including
embryogenesis, immune responses,
stem cell fate decisions, and tumorigenesis
(1-4). Technologies that enable monitoring
and recording of cell-cell communication have 
driven advances in our basic understanding of 
many biological processes (5-12). For example, 
intercellular enzymatic proximity labeling has 
been used to monitor cell-cell communication 
in vivo and in vitro (8, 7), providing informa- 
tion on immune cell function, and the genera- 
tion of a cell-penetrating fluorescent protein 
secreted from tumor cells can be used to label 
cells in tumor metastatic niches (13). However, 
for long-term tracing of cell contact, existing 
methods such as tagging of surface proteins or 
secretion of fluorescent proteins are unsuit- 
able because many cell-cell interactions are 
dynamic and transient. Existing technologies 
also cannot monitor and permanently trace 
dynamic interactions between T cell progen- 
itors and thymus cells before mature T cells 
migrate to lymphoid organs (14), nor can they
track transient interactions between tumor
cells and their niche cells during dissemination
and metastasis (13, 15). Indeed, many biological
processes occur weeks or months after
(transient) cell-cell contact, such as tumor immune
responses and tissue regeneration (16-18).
To understand the dynamics of complex in vivo 
cell-cell communication over time, we devel- 
oped a genetic system that converts a cellular 
contact event into a controllable genetic pro- 
gram, enabling subsequent monitoring of 
ongoing cell-cell contacts or tracing of cell 
contact history in vivo (fig. S1). For broader ap- 
plication, we also generated a Cre recombinase- 
induced genetic tool box that can be used for 
identifying and manipulating cell lineages di- 
rectly influenced by cell-cell contact in vivo. 


Results 
Generation of the gLCCC system for monitoring 
ongoing cell-cell contacts 


The Notch signaling pathway mediates com- 
munication between neighboring cells in many 
cellular processes during development and 
homeostasis (19, 20). In the canonical Notch 
signaling pathway (21), binding of the Notch ligand
to the Notch receptor triggers γ-secretase-mediated
cleavage within the Notch transmembrane
region, and the cleaved Notch
intracellular domain then translocates into
the nucleus and activates the transcription of
target genes (Fig. 1A). Like endogenous delta
(ligand)-Notch (receptor) communication, activation
of a synthetic Notch (synNotch) pathway
depends on direct cell-cell contact, which
has been used to program contact-dependent 
transcriptional regulation (5-7, 10, 22, 23). 
Given the contact-dependent feature of syn- 
Notch activation, we explored its application 
for in vivo genetic monitoring and tracing of 
cell-cell contacts. We engineered two separate 
mouse alleles through gene targeting: One 
allele encoded a synNotch ligand and the other 
a synNotch receptor (Fig. 1B). The synNotch
ligand is a membrane-tethered green fluorescent
protein (mGFP) that serves as an artificial
Notch ligand. The synthetic receptor for Notch 
is a modified Notch protein in which the ex- 
tracellular and intracellular domains are re- 
placed with a Myc-tagged anti-GFP nanobody 
(aGFP) and the tetracycline (tet) transactiva- 
tor (tTA), respectively, which is abbreviated 
as aGFP-N-tTA (Fig. 1B). To mitigate the po- 
tential influence of artificial ligand-receptor 
binding on dynamic cell-cell interactions such 
as postcontact dissociation, we used a low- 
affinity aGFP nanobody (LaG17) that has an 
affinity for GFP that is 1/70th of that of another 
aGFP nanobody (LaG16) (5, 24). To target 
the synNotch ligand to sender cells [e.g., 
cardiomyocytes (CMs)], mGFP was driven 
under a CM-specific promoter. To target the 
synNotch receptor to receiver cells [e.g., endo- 
thelial cells (ECs)], aGFP-N-tTA was driven 
under an EC-specific promoter. In mice car- 
rying both sender and receiver alleles, as 
well as a reporter allele such as tetO-LacZ 
that encodes the β-galactosidase enzyme for
X-galactosidase (X-gal) staining detection (25, 26), 
if a sender cell contacts a receiver cell, then the 
binding between GFP and aGFP-N-tTA trig- 
gers cleavage of the Notch transmembrane 
domain, thereby releasing tTA (Fig. 1, C and 
D). Cleaved tTA then translocates into the 
nucleus to activate the tetO-LacZ reporter sys- 
tem (Fig. 1D). We refer to mice carrying these 
three genetic elements (synNotch ligand, re- 
ceptor, and reporter) as gLCCC mice for “ge- 
netic labeling of cell-cell contact.” 
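
The reporter logic described above can be summarized as a small truth function. The following is only an illustrative sketch, not part of the published method: the class and function names are hypothetical, each condition is reduced to a Boolean, and the `dapt` and `dox` flags stand for the γ-secretase inhibitor and doxycycline controls reported later in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conditions:
    """Hypothetical Boolean summary of one sender-receiver pair."""
    sender_has_ligand: bool      # sender cell expresses mGFP
    receiver_has_receptor: bool  # receiver cell expresses aGFP-N-tTA
    in_contact: bool             # the two cells are physically touching
    dapt: bool = False           # gamma-secretase inhibitor present
    dox: bool = False            # doxycycline present (inactivates tTA)


def glccc_reporter_on(c: Conditions) -> bool:
    """Return True if the tetO reporter is expected to be active."""
    # Ligand-receptor binding needs both components and direct contact.
    ligand_bound = (
        c.sender_has_ligand and c.receiver_has_receptor and c.in_contact
    )
    # Release of tTA requires gamma-secretase cleavage, which DAPT blocks.
    tta_released = ligand_bound and not c.dapt
    # Released tTA drives the tetO reporter unless Dox inactivates it.
    return tta_released and not c.dox
```

In this simplified model the reporter is on only when all three genetic conditions hold and neither inhibitor is present, which matches the littermate, DAPT, and Dox controls described in the text. It ignores timing and expression levels.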

We used CMs as sender cells and ECs as 
receiver cells (Fig. 1, E and F) and generated 
cardiac troponin T2 (Tnnt2)-mGFP knock-in
mice to express mGFP specifically in TNNT2⁺
CMs (Fig. 1E and fig. S2) and cadherin 5
(Cdh5)-aGFP-N-tTA knock-in mice to express
aGFP-N-tTA in CDH5⁺ ECs (Fig. 1F and fig. S3).
Immunostaining and confocal examination
of Tnnt2-mGFP;Cdh5-aGFP-N-tTA embryos
documented the close physical proximity of
GFP⁺ CMs and aGFP⁺ ECs in the heart (Fig.
1G). We generated Tnnt2-mGFP;Cdh5-aGFP- 
N-tTA;tetO-LacZ (or heart-gLCCC) mice and 
tested whether contact between sender and 
receiver cells triggered LacZ expression spe- 
cifically in cardiac ECs (Fig. 1D). X-gal stain- 
ing of mouse tissues was used to detect LacZ 
expression in ECs (25, 26). Although no LacZ- 
expressing cells were observed in littermate 
controls with other genotypes, we did observe 
LacZ expression in the hearts from embryonic 
day 9.5 (E9.5) heart-gLCCC embryos (Fig. 1H). 
Immunostaining of sections prepared from 
these embryos revealed specific LacZ expres- 
sion in the endocardial ECs located adjacent 
to GFP⁺ CMs (Fig. 1, I and J). To confirm that
LacZ expression was specifically triggered by
the synNotch pathway, we showed that heart-gLCCC
embryos exposed to the γ-secretase
inhibitor N-[N-(3,5-difluorophenacetyl)-l-alanyl]-S-phenylglycine
t-butyl ester (DAPT) (27) had
no LacZ⁺ cells (Fig. 1, K and L). To validate
Tet-responsive expression, we showed that
heart-gLCCC embryos exposed to doxycycline (Dox),
which inactivates tTA (28), had no LacZ⁺ cells
(Fig. 1, K to M). We also used
Tnnt2-mGFP;Cdh5-aGFP-N-tTA;tetO-tdT (where tdT is tdTomato)
mice to further validate the dynamic contacts
between ECs and CMs in postnatal hearts (fig.
S4). Thus, this gLCCC system offers cell contact-specific
labeling of ECs in contact with CMs.

Fig. 1. Genetic labeling of in vivo cell-cell contact. (A) Schematic figure showing
the canonical Notch signaling pathway. NICD, Notch intracellular domain.
(B) Schematic showing components of the synNotch pathway: mGFP as ligand and
aGFP-N-tTA as receptor. aGFP is Myc-tagged. (C) Illustration showing the
gLCCC system for genetic labeling of cell-cell contact. (D) Schematic showing the
overall design of the gLCCC system. Binding of GFP to the aGFP nanobody leads
to cleavage of the transmembrane domain of the synNotch receptor by
γ-secretase. This subsequently releases tTA, which then translocates into the
nucleus and activates LacZ expression. (E to G) Whole-mount and sectional
immunostaining images of E9.5 Tnnt2-mGFP (E), Cdh5-aGFP-N-tTA (F), and
Tnnt2-mGFP;Cdh5-aGFP-N-tTA (G) embryos. (H) Whole-mount X-gal staining for
the detection of the β-galactosidase enzyme encoded by the LacZ gene on E9.5 [...]

To determine how long cell-cell contact is
needed to activate the gLCCC system, we cul- 
tured sender and receiver cells in vitro and 
found by fluorescence microscopy that tdT⁺
cells started to appear after 4 hours of con- 
tact (fig. S5, A to D). Alternatively, we used 
DAPT to inhibit the gLCCC system at different 
time points and assessed tdT expression after 
36 hours of co-culture. Within 2 hours of cell-cell 
contact, we detected receiver cells expressing 
tdT (fig. S5, E to H). Free mGFP protein did 
not lead to tdT expression in receiver cells (fig. 
S6). Half of cardiac ECs lost tdT 2 days after 
inhibition of contact-activated gLCCC (fig. S7).

To test whether cardiac ECs continue to in- 
teract with surrounding CMs in the adult heart, 
we collected hearts from 12-week-old Tnnt2- 
mGFP;Cdh5-aGFP-N-tTA;tetO-tdT mice (fig. S8, 
A and B). Immunostaining of heart sections for 
GFP, tdT, and CDH5 showed that a proportion 
of coronary capillary ECs expressed tdT (fig. S8, 
C and D), suggesting that they were in contact 
with CMs. Transmission electron microscopy of 
adult wild-type hearts provided morphological 
evidence showing the close proximity between 
the plasma membranes of CMs and ECs (fig. S8, 
E and F). One advantage of the tdT reporter is 
that it enables the isolation of cells by flow 
cytometry for further characterization of ECs 
that had contact with CMs. Isolated tdT⁺ ECs
showed increased expression of genes for cel- 
lular respiration, intrinsic apoptotic signaling, 
and cellular responses to stress (fig. S8, G to 
K). In addition to heart-gLCCC, we also gen- 
erated liver-gLCCC with hepatocyte-specific 
promoter albumin-driven mGFP (Alb-mGFP)
mice and Cdh5-aGFP-N-tTA;tetO-tdT mice and
found that EC-hepatocyte contact activated
tdT expression in postnatal day 0 (P0) livers
(fig. S9) but not in adult livers (fig. S10). This 
suggested that gLCCC could be used to study 
dynamic cell-cell contacts in multiple cell types. 


Generation of a gTCCC system for genetic 
tracing of cell-cell contact history 


To enable tracking of receiver cells that had 
any history of contact with sender cells, we 
incorporated the Cre-loxP system downstream 
of the synNotch pathway (Fig. 2A). The tTA 
reporter was replaced by tetO-Cre and a Rosa26 
(R26)-floxed-Stop-reporter, which is located in 
the ubiquitously active gene locus Rosa26 (29). 
Sender-receiver cell contact triggered activa- 
tion of the synNotch pathway and transloca- 
tion of tTA into the nucleus, turning on the 
expression of Cre recombinase in the receiver. 
Cre then mediated irreversible Cre-loxP re- 
combination, leading to constitutive expression 
of tdT in the receiver and all of its descendants, 
thus permanently tagging them (Fig. 2A), even 
if the receiver no longer had contact with the 
sender or its descendants had never contacted 
any sender. Therefore, any contact event would 
genetically tag the receiver cell, allowing genetic
tracing of such receiver cells and their progeny
even after they migrated away from the
sender cells or differentiated or transformed 
into other cell lineages. We call mice carrying 
synNotch and Cre-loxP systems gTCCC mice 
for “genetic tracing of cell-cell contact.” 
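
The difference between a transient readout and a permanent tag can be made concrete with a toy time-course model. This is a hedged sketch with hypothetical names, and it makes simplifying assumptions that are not claims of the paper: contact and Dox exposure are Boolean per time point, Cre acts within the same time point, and reporter decay (such as the tdT loss reported in fig. S7) is ignored.

```python
def trace_contact_history(contact_by_timepoint, dox_by_timepoint=None):
    """Return (cre_expressed, permanently_tagged) for each time point.

    contact_by_timepoint: sequence of bools, True while the receiver
        touches a sender cell.
    dox_by_timepoint: optional sequence of bools, True while Dox is
        present and tTA is therefore inactive.
    """
    recombined = False  # Cre-loxP excision is irreversible once it occurs
    states = []
    for i, contact in enumerate(contact_by_timepoint):
        dox = bool(dox_by_timepoint and dox_by_timepoint[i])
        # tTA drives tetO-Cre only during contact and without Dox.
        cre_expressed = contact and not dox
        # A single Cre pulse removes the Stop cassette for good; the tag
        # is then inherited by all descendants of the cell.
        recombined = recombined or cre_expressed
        states.append((cre_expressed, recombined))
    return states
```

For example, `trace_contact_history([False, True, False, False])` keeps the second element of each tuple `True` from the contact onward, mirroring how a receiver stays labeled after it leaves the sender.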

We studied migration and fate transitions 
of endocardial ECs in the developing mouse 
heart. Cardiac ECs contact CMs during early 
embryonic heart development (30). In cardiac 
valve formation, these endocardial ECs then 
shift to a mesenchymal fate and migrate into 
the cardiac cushion through the endothelial- 
to-mesenchymal transition (EMT) program (30) 
(Fig. 2B). In mice, this dissociation of endocardial
ECs away from CMs is already completed
before P0, when the cardiac cushion
has been remodeled into cardiac valves and 
mesenchymal cells in the valve no longer ex- 
press any EC markers (Fig. 2B). We generated 
Tnnt2-mGFP;Cdh5-aGFP-N-tTA;tetO-Cre;R26-tdT
(or heart-gTCCC) mice (Fig. 2A). The tdT
signals that we observed in heart-gTCCC mice 
at E9.0 indicated that the cardiac ECs were 
successfully genetically labeled, and their mes- 
enchymal descendants maintained tdT expres- 
sion at E10.0 (Fig. 2, C and D). These data 
demonstrated that the genetic tracer tdT was 
permanently maintained in the receiver cells 
and all of their descendants, even when their 
descendants changed cell fate, e.g., due to EMT 
during cardiac valve development. 

Whole-mount fluorescence images of organs, 
immunostaining of tissue sections, and flow 
cytometric analysis revealed that most of the 
cardiac ECs retained tdT expression in the heart-gTCCC
mice at P0, whereas no tdT expression
was observed in their littermate controls of
other genotypes (Fig. 2, E to H). We rarely detected
tdT⁺ ECs in other heart-gTCCC mouse
organs examined, with the exception of the
liver (see below) (fig. S11). Dox administration
abolished tdT labeling of cardiac ECs in heart-gTCCC
mice, demonstrating controllable genetic
regulation by the Tet system (Fig. 2, E
to H). At P0, even though the descendants of
ECs that became mesenchymal cells in the
valve were nowhere near CMs, they maintained
tdT expression (Fig. 2I), which verified
their EMT fate transition. These tdT⁺ valve
interstitial cells expressed the mesenchymal cell
marker platelet-derived growth factor receptor
α (PDGFRα) (Fig. 2I), but they no longer
expressed the EC marker platelet EC adhesion
molecule (PECAM) (Fig. 2I). Thus, we could
trace ECs even after they switched fate to
mesenchymal cells.

Endocardial ECs migrate at E8.5 to the nearby
liver bud, where they contribute to the liver
vasculature (31) (Fig. 2J). We found that a subset
of ECs in postnatal liver had contact with
CMs at an early embryonic stage. About 20%
of ECs in livers collected from P0 heart-gTCCC
mice showed tdT fluorescence (Fig. 2, K and L,
and fig. S11B). We did not detect any tdT⁺ ECs
in the livers collected from P0
Cdh5-aGFP-N-tTA;tetO-Cre;R26-tdT littermate controls (Fig.
2K). Considering that no liver cells expressed
mGFP synNotch ligand in the heart-gTCCC
mice (fig. S11B), the tdT⁺ ECs in the liver at P0
likely originated from the heart (31), and these
ECs probably contacted mGFP⁺ CMs before
their migration into the liver bud (Fig. 2J).
Furthermore, these tdT⁺ ECs were maintained
in the adult liver, where they may contribute
to the adult liver vasculature (fig. S12, A to D).
We therefore were able to detect cell-cell communication
temporally and spatially isolated
during organogenesis and homeostasis (fig.
S12, E and F).


Genetic tracing of cell-cell contact history 
reveals outgrowth of tumor angiogenesis 


To further demonstrate the utility of gTCCC
for tracing cell contact histories, we studied 
vascular ECs in tumor angiogenesis, during 
which blood vessels are recruited from periph- 
eral tissues into hypoxic tumors (32-34). In this 
pathological condition, ECs migrate into 
tumors, and their contact history with tumor 
cells may confer them with particular capa- 
bilities of tumor blood vessels (34). We traced 
ECs that had contact with tumor cells and 
followed EC fate during tumor growth in vivo. 
The mouse lung tumor cell line TC-1 was en- 
gineered to constitutively express mGFP (TC-1- 
mGFP) as sender cells, and these cells were in 
direct contact with infiltrating ECs from host 
mice (fig. S13). To genetically trace tumor-infiltrating
ECs, we implanted TC-1-mGFP
tumor cells into Cdh5-aGFP-N-tTA;tetO-Cre; 
R26-tdT receiver mice (Fig. 3A) and collected 
tissues for analysis 7 and 14 days after im- 
plantation (Fig. 3B). ECs showed tdT fluo- 
rescence in TC-1-mGFP tumors at day 7 (Fig. 
3, C and D), but not in TC-1 control tumors 
(Fig. 3, C and D) or other organs at any stage 
analyzed in Cdh5-aGFP-N-tTA;tetO-Cre;R26- 
tdT mice (fig. S14). Bandeiraea simplicifolia 
lectin injection showed that these tdT⁺ ECs
were functionally connected to circulation (Fig.
3E). These tdT⁺ ECs were mainly found in
tumors, not in the thin layer of peripheral
capsule tissue, at day 7 (Fig. 3, F and G).

At day 14, however, we observed tdT⁺ ECs
outside of the tumor, in the peripheral capsule
tissues that were free of any mGFP⁺ tumor
cells (Fig. 3, H and I). The thick layer of capsule 
detected at day 14 was enriched in fibroblasts 
and macrophages (Fig. 3J), which may pro- 
duce angiogenesis factors to recruit ECs from 
within tumors. Indeed, we detected enriched 
vascular endothelial growth factor A (VEGFA) 
expression in the thick peripheral capsule 
at day 14 (Fig. 3J), which may partly ex- 
plain the strong recruitment of blood vessels 
by the tumor. We used Dox treatment to inhibit
the gTCCC system at different times after
tumor implantation and found that tumor-infiltrating
ECs at early stages (i.e., the first
5 days) were genetically labeled and subsequently
moved to the tumor capsule (fig. S15).
Thus, the ability of our gTCCC system to trace
the postcontact history of cells throughout
their lifetimes enabled us to show that some
vascular ECs, after dissociating from cancer
cells, extended into the peripheral tissues with
no tumor cells (Fig. 3K).

Fig. 2. Genetic tracing of ECs that have had contact with CMs. (A) Schematic
showing the gTCCC strategy for permanent genetic tracing of cardiac ECs
that have had contact with CMs. (B) Illustration of the development of
the endocardial cushion and cardiac valves (red). Green indicates CMs.
(C and D) Whole-mount fluorescence images of E9.0 (C) and E10.0 (D)
Tnnt2-mGFP;Cdh5-aGFP-N-tTA;tetO-Cre;R26-tdT (heart-gTCCC) embryos and
immunostaining for tdT and GFP. (E) Whole-mount fluorescence images of
P0 hearts collected from mice of different genotypes as indicated. Inserts are
bright-field images. Dox was added to inhibit the Tet system. (F) Immunostaining
of heart sections for GFP, tdT, and PECAM. (G) Fluorescence-activated
cell sorting (FACS) analysis of tdT⁺ cardiac ECs from hearts. (H) Quantification
of the percentage of ECs expressing tdT. Data are shown as the mean ± SEM;
n = 6. (I) Immunostaining of heart sections for tdT and GFP, PECAM, or
PDGFRα. Arrowheads indicate tdT⁺ mesenchymal cells in valves (asterisks).
(J) Illustration showing that ECs from the developing heart migrate to the liver
bud at E8.5 and subsequently contribute to the liver vasculature. (K) Whole-mount
fluorescence images of P0 livers. (L) Immunostaining of P0 liver sections
for tdT and PECAM or tdT, CDH5, and HNF4α. Arrowheads indicate tdT⁺ ECs. Scale
bars: yellow, 400 μm; white, 100 μm. Each image is representative of six individual
biological samples. Data are shown as mean ± SEM.

We sorted three types of ECs from implanted
tumors: tdT⁺ tumor ECs (tdT⁺ tECs),
tdT⁺ capsule ECs (tdT⁺ cECs), and cECs lacking
tdT fluorescence (tdT⁻ cECs) (Fig. 3L).
Principal component analysis revealed three
different groups of ECs with distinct transcriptional
profiles (Fig. 3M). Comparisons of gene
expression between tdT⁺ cECs and tdT⁺ tECs
highlighted enriched pathways for angiogenesis,
cell adhesion, the immune system process,
blood vessel development, cellular response to
VEGF stimulus, and regulation of cell migration
in tdT⁺ cECs compared with those from
tdT⁺ tECs (Fig. 3, N and O, and fig. S16A). Comparison
of gene expression between tdT⁺ cECs
and tdT⁻ cECs showed increased expression of
genes associated with inflammatory responses,
immune system processes, positive regulation
of cell migration and angiogenesis, chemotaxis,
and regulation of cell adhesion in tdT⁺
cECs versus tdT⁻ cECs. This suggested that an
interaction between ECs and tumor cells may
endow these ECs with particular properties
in angiogenesis, migration, and the inflammatory
response (Fig. 3P and fig. S16B). It
appears possible that blood vessels, when recruited
into tumors, can grow out of tumors
into peripheral capsules.

Fig. 3. Genetic tracing of the tumor cell-EC interaction during tumor
growth. (A and B) Schematic showing the tumor-gTCCC strategy to trace tumor
cell-EC contact history (A) and experimental design (B). (C) Immunostaining
of tumor sections 7 days after tumor implantation (+7 day). Right panel
shows quantification data for the percentage of ECs expressing tdT. (D) FACS
analysis of the percentage of CD31⁺ ECs expressing tdT in +7 day tumors.
(E) Detection of perfused Bandeiraea simplicifolia lectin in tumor tdT⁺ ECs
[...] (G and I) Quantification of the percentage of ECs expressing tdT in the tumor or
capsule at +7 (G) and +14 (I) days. (J) Immunostaining of +14 day tumor sections
for fibroblast and macrophage markers. (K) Illustration showing that vessels
expand from the periphery into the tumor (ingrowth) and tdT⁺ vessels
expand out of the tumor into the peripheral capsule (outgrowth) during tumor
growth and formation of the capsule. (L) Schematic diagrams illustrating
the experimental strategy for RNA-sequencing experiments. (M) Principal
component analysis of GFP⁻CD45⁻Ter119⁻CD31⁺ cell signatures from tECs and
cECs. n = 3. (N) Venn diagram of differentially expressed gene numbers in
[...] the enriched pathways. Scale bars, 100 μm. Data are shown as mean ± SEM.
Each staining is representative of six individual biological samples.


Cre-induced R26-mGFP mouse line for 
broad spectrum of cell types as 
sender cells 


To broaden the application of the gLCCC and 
gTCCC systems to all cell types that contact a 
particular receiver cell type, we generated 
a knock-in mouse line, R26-mGFP, for Cre- 
dependent expression of mGFP ligand in 
sender cells (Fig. 4A). We inserted an mGFP 
transgene expression cassette into the ubiq- 
uitous Rosa26 locus, with the mGFP coding 
sequences following a loxP-flanked 3x polyA 
Stop cassette to render mGFP expression in- 
ducible upon Cre recombinase administration 
(Fig. 4A). Whole-mount fluorescence or sec- 
tional staining of E15.5 R26-mGFP embryos 
showed no GFP expression (Fig. 4B), indicat- 
ing no leakiness of GFP expression without 
Cre recombinase. By contrast, ACTB-Cre;R26- 
mGFP embryos, which had Cre-loxP-mediated 
excision in all cells, yielded GFP expression 
throughout (Fig. 4C), demonstrating that mGFP 
synNotch ligand expression in targeted cells 
was Cre dependent. 

To use R26-mGFP mice to study any other 
Cre⁺ cell-EC contacts, we crossed them with
Nestin-CreER, an inducible Cre driver in Nestin⁺
cells that include neurons, epididymis vascular
smooth muscle cells, and myoblasts. We
generated Nestin-CreER;R26-mGFP senders 
and Cdh5-aGFP-N-tTA;tetO-tdT receiver or 
Nestin-gLCCC mice (Fig. 4D). We treated the 
Nestin-gLCCC mice with tamoxifen (Tam) at 
E12.5 to induce Cre activation and found, at 
E15.5, that some GFP⁺ cells expressed the neuronal
nuclear protein (NeuN), and they were
in proximity with cadherin 5-positive (CDH5⁺)
ECs (Fig. 4, E and F). Fluorescence imaging of
sagittal sections of E15.5 embryos showed that
tdT signals mirrored the GFP expression pattern,
except in the brain, where there was no
detectable tdT expression (Fig. 4G). The tdT⁺
ECs were located in the trigeminal ganglion,
dorsal root ganglion, and intercostal neurovascular
bundle, but there were no tdT⁺ ECs
detected in the brain (Fig. 4H and fig. S17). In
mice of the same genotype treated with corn
oil (designated the “no Tam” group), neurons
did not express GFP and ECs remained tdT⁻
in the trigeminal ganglion (Fig. 4I). We confirmed
that tdT expression in ECs was controlled
by the tet-regulated genetic system,
because Dox administration repressed tdT
expression in ECs of Tam-treated Nestin-gLCCC
mice (Fig. 4J). Therefore, R26-mGFP
is applicable for generating sender cells from
any Cre-expressing cell type (Fig. 4K).


Cre-induced H11-aGFP-N-tTA mouse line 
enabling any cell type to be receiver cells 


To find all of the cell types that interact with a
particular sender cell, we generated a knock-in
mouse line, H11-aGFP-N-tTA, for Cre-dependent
expression of the aGFP-N-tTA receptor in receiver
cells (Fig. 5A). We then inserted an
aGFP-N-tTA expression cassette into another
ubiquitous gene locus, Hipp11 (35), with the
aGFP-N-tTA coding sequences following a
loxP-flanked 3x polyA stop cassette to render
receptor expression inducible upon Cre recombinase
administration (Fig. 5A). Sectional
staining of E15.5 H11-aGFP-N-tTA embryos
showed no Myc-aGFP (annotation of aGFP-N-tTA)
expression (Fig. 5B), indicating no
receptor expression without Cre recombinase.
By contrast, Tie2-Cre;H11-aGFP-N-tTA
embryos, with Cre-loxP-mediated excision
in all endothelial lineages, yielded receptor
expression throughout (Fig. 5, C and D), demonstrating
that aGFP-N-tTA synNotch receptor
expression in targeted cells was Cre
dependent.

To functionally characterize H11-aGFP-N-tTA
mice in studying cell-cell interactions, we
crossed Tnnt2-mGFP mice with Tie2-Cre;H11-aGFP-N-tTA
mice and used tetO-tdT as the
readout of their contacts (Fig. 5E). In
Tnnt2-mGFP;H11-aGFP-N-tTA;tetO-tdT mice, we could
not detect tdT expression in the heart unless the
mice were crossed with the Tie2-Cre mouse line
(Fig. 5F). Immunostaining of heart sections
of Tnnt2-mGFP;Tie2-Cre;H11-aGFP-N-tTA;tetO-tdT
mice for GFP and tdT showed tdT expression
in a vascular pattern (Fig. 5G). Most of the
ECs in the atrium and ventricle were tdT⁺, except
for endocardial ECs on the valves, where
they were anatomically separated from CMs
(Fig. 5H). No tdT⁺ ECs were detected in other
organs or tissues (Fig. 5I), indicating that the
genetic activity of receptor-expressing cells
also depends on ligand from sender cells, e.g.,
mGFP⁺ CMs. Therefore, H11-aGFP-N-tTA is applicable
for generating receiver cells from any
Cre-expressing cell type (Fig. 5J).


One Tigre-synNotch mouse allele to unravel any
cells that contact Cre⁺ cells

To facilitate the broad application of an intercellular
genetic approach, we generated a Cre-induced
synNotch allele on the tightly regulated
(Tigre) genomic locus (36), resulting in a Tigre-synNotch
mouse that could simultaneously
express synthetic ligand in Cre⁺ cell lineages
and receptor in Cre⁻ cells. In this Tigre-synNotch
design, tetO-rox-Stop-rox-tdT-insulator-CAG-loxP-aGFP-N-tTA-pA-loxP-mGFP
was knocked
into the Tigre gene locus by CRISPR-Cas9
such that the receptor, ligand, and tTA readouts
were all combined in one mouse allele
(Fig. 6A). By crossing this Tigre-synNotch mouse
with any Cre mouse, Cre-loxP recombination
would remove the loxP-flanked aGFP-N-tTA
cassette, leading to mGFP ligand expression in
Cre⁺ sender cells, whereas Cre⁻ cells continued
to express aGFP-N-tTA receptor as receivers
(Fig. 6B). CAG-Dre was used to remove rox-Stop-rox
in the Tigre-synNotch allele (Fig. 6B). If
any Cre⁻ cells were in contact with Cre⁺ cells,
then tTA would activate tdT in these aGFP-N-tTA⁺
receiver cells (Fig. 6B). The generation of
Tigre-synNotch facilitated the genetic detection
of in vivo cell-cell contacts using a single
Cre-responsive mouse allele.
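
The role assignment in the Tigre-synNotch design can be sketched as a neighbor lookup. This is an illustrative model only: the function name, the dictionary-based cell graph, and the Boolean `dre` flag are assumptions made for the example, and mosaic or partial recombination is not modeled.

```python
def tigre_tdt_cells(cre_status, neighbors, dre=True):
    """Return the set of cells expected to switch on tdT.

    cre_status: dict mapping cell name -> True if the cell is Cre-positive.
    neighbors:  dict mapping cell name -> iterable of cells it touches.
    dre:        True if Dre has removed the rox-Stop-rox cassette.
    """
    if not dre:
        return set()  # the tetO-tdT readout is still blocked by rox-Stop-rox
    labeled = set()
    for cell, is_cre_positive in cre_status.items():
        if is_cre_positive:
            # Cre removed the loxP-flanked receptor cassette, so this cell
            # expresses mGFP and acts as a sender, not a receiver.
            continue
        # A Cre-negative cell keeps the receptor and is labeled only if it
        # touches at least one Cre-positive (mGFP-expressing) cell.
        if any(cre_status.get(n, False) for n in neighbors.get(cell, ())):
            labeled.add(cell)
    return labeled
```

With a Cre⁺ endothelial cell touching a Cre⁻ pericyte, and a Cre⁻ neuron touching only the pericyte, the function returns just the pericyte, which is the contact-restricted labeling the design aims for.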

To test this Tigre-synNotch allele, we crossed
it with the TEK receptor tyrosine kinase (Tie2)-Cre
line (37) and collected Tie2-Cre;CAG-Dre;Tigre-synNotch
mouse tissues for analysis.
Compared with littermate control CAG-Dre;Tigre-synNotch,
we could detect patterning of
tdT robustly in Tie2-Cre;CAG-Dre;Tigre-synNotch
mice (Fig. 6C). Immunostaining of
Tie2-Cre;CAG-Dre;Tigre-synNotch mouse tissues such as
heart and lung for GFP and PECAM revealed
GFP expression in PECAM⁺ ECs (Fig. 6D). Immunostaining
of tissue sections for GFP and
tdT showed a particular patterning of tdT that
lined along GFP⁺ cells throughout different
organs or tissues (Fig. 6E). For example, these
GFP⁺ ECs contacted PDGFRβ⁺ pericytes in the
brain, TNNT2⁺ CMs in the heart, PDGFRα⁺
fibroblasts in the lung, and hepatocytes in the
liver, marking these diverse types of cells as
tdT⁺ (Fig. 6F). By contrast, we detected very
few tdT⁺ cells in CAG-Dre;Tigre-synNotch mouse
tissues (Fig. 6G). Additionally, we used inducible
Cre lines to label their neighboring cells
as tdT⁺. For instance, after induction with a
low dose of Tam, Nestin-CreER- or CAG-CreER-labeled
sparse mGFP⁺ cells activated tdT expression
only in their neighboring cells (Fig.
6H and fig. S18). These results demonstrated
that Tigre-synNotch enables the detection of
in vivo cell-cell contacts broadly in any given
cell type in a Cre mouse line (Fig. 6I).


Simultaneous application of gLCCC and gTCCC
in one mouse 


To enable the simultaneous detection of ongoing
cell-cell contact and contact history in
one mouse, we generated a tetO-Dre-BFP reporter
line [where Dre is another site-specific
recombinase targeting a specific 32-bp DNA site,
rox (38), and BFP is blue fluorescent protein],
by which an insulator-flanked tetO-Dre-IRES-BFP-polyA
cassette was knocked into the
Rosa26 locus using CRISPR-Cas9 (fig. S19A).
In this design, when an mGFP⁺ sender cell and
an aGFP-N-tTA⁺ receiver cell come into contact,
tTA activates Dre-BFP expression, marking receiver
cells as BFP⁺ (gLCCC; fig. S19B). Using
R26-rox-tdT (Rosa26-rox-Stop-rox-tdTomato),
Dre-rox recombination results in permanent
genetic tracing of receiver cells by tdT (gTCCC;
fig. S19B). We used cardiac EC-CM contact to
test this simultaneous gLCCC and gTCCC system
using Tnnt2-mGFP;Cdh5-aGFP-N-tTA;tetO-Dre-BFP;R26-rox-tdT
mice. Using whole-mount
fluorescence and sectional immunostaining,
we found that BFP and tdT were both expressed
in CDH5⁺ ECs in the heart (fig. S19,
C and D). In mice lacking either synthetic ligand
or receptor, we did not detect any BFP or
tdT in cardiac ECs (fig. S19, C and D), suggesting
that simultaneous gLCCC and gTCCC activation
requires both the synNotch
ligand and the synNotch receptor.

Fig. 4. Generation of the R26-mGFP mouse line for Cre-induced mGFP in
sender cells. (A) Schematic showing the generation of the R26-mGFP allele by
homologous recombination using CRISPR-Cas9. (B) Whole-mount or sectional
fluorescence images of E15.5 R26-mGFP embryos. Insert, bright-field whole-mount
image. (C) Whole-mount or sectional fluorescence images of E15.5
ACTB-Cre;R26-mGFP embryos. Insert, bright-field whole-mount image. (D) Schematic
showing the design of the Cre-induced gLCCC system for study of the interaction
between Cre⁺ cells and receiver cells, e.g., the Nestin⁺ cell-EC interaction.
(E) Schematic showing the experimental design by Tam induction of Cre
(Nestin-CreER). (F) Immunostaining of an E15.5 tissue section for GFP, NeuN, and CDH5
showing that GFP is specifically expressed in neurons. (G) Immunostaining of
E15.5 whole embryonic sections for GFP and tdT. Regions 1 to 4 are magnified in
(H). (H) Immunostaining of E15.5 tissue sections for GFP, tdT, and CDH5.
Three-dimensionally reconstructed images show neurons (GFP) and their interacting
ECs (tdT). The percentage of ECs expressing tdT in each region is quantified
on the right panels. Data are shown as the mean ± SEM; n = 5. (I) Immunostaining of
E15.5 embryonic sections from mice that received no Tam for GFP, tdT, and CDH5.
Quantification data are shown in the right panel. Data are shown as the mean ± SEM;
n = 5. (J) Dox treatment inhibits tTA binding to tetO, preventing tdT gene activation.
Shown is immunostaining of E15.5 tissue sections from the trigeminal ganglion for
tdT, GFP, and CDH5 in Nestin-CreER;R26-mGFP;Cdh5-aGFP-N-tTA;tetO-tdT mice
treated with Dox and Tam. (K) Illustration showing Cre-induced mGFP expression in
sender cells. Scale bars: yellow, 1 mm; white, 100 μm. Each image is representative
of five individual biological samples.

Fig. 5. Generation of the H11-aGFP-N-tTA mouse line for Cre-induced
aGFP-N-tTA in receiver cells. (A) Schematic showing the generation of the
H11-aGFP-N-tTA allele by homologous recombination using CRISPR-Cas9.
(B) Immunostaining of H11-aGFP-N-tTA embryonic sections for the Myc tag.
(C) Schematic showing Tie2-Cre-induced aGFP-N-tTA expression (left). Shown is
immunostaining of Tie2-Cre;H11-aGFP-N-tTA sections (right) for the Myc tag.
(D) Immunostaining of brain, heart, lung, and liver sections from E15.5
Tie2-Cre;H11-aGFP-N-tTA embryos for the Myc tag and CDH5. (E) Schematic showing
the design of the Cre-induced gLCCC system for study of the interaction between
sender and Cre⁺ receiver, e.g., the CM-EC interaction. (F) Whole-mount
fluorescence images of hearts from E15.5 embryos. Insert, bright-field whole-mount
images. (G and H) Immunostaining of E15.5 heart sections for GFP and
tdT (G). Regions 1 to 4 are magnified in (H). The percentage of ECs expressing
tdT in each region is quantified on the right panels. Data are shown as the
mean ± SEM; n = 5. (I) Immunostaining of other tissues of E15.5 embryos for
GFP and tdT. (J) Illustration showing Cre-induced aGFP-N-tTA expression in
receiver cells. Scale bars: yellow, 1 mm; white, 100 μm. Each image is
representative of five individual biological samples.

[...] the Tigre-synNotch allele by homologous recombination using CRISPR-Cas9.
[...] bars: yellow, 1 mm; white, 100 μm. Each image is representative of five individual
biological samples. Data are shown as mean ± SEM.

We next used BFP (for gLCCC) and tdT (for 
gTCCC) to distinguish ongoing cell-cell con- 
tact and contact history in one mouse. Con- 
sidering the established models of endocardial 
EMT in valve formation (30) and endocardial 
contribution to the liver vasculature (37), we 
examined BFP and tdT reporter in valve mes- 
enchymal cells and ECs in the liver. Immuno- 
staining of heart and liver sections showed 
that tdT was still expressed in valve mesen- 
chymal cells and liver ECs when they were no 
longer in contact with GFP⁺ CMs, but they
did not express any BFP (fig. S19, E and F).
Coronary ECs that were in contact with CMs
expressed both BFP and tdT (fig. S19, E and F).
These data demonstrate that ECs that are in
contact with CMs remain tdT⁺BFP⁺, whereas
ECs that dissociate from CMs remain tdT⁺BFP⁻
when they have undergone EMT or migrated to
the liver (fig. S19G). Thus, tetO-Dre-BFP enabled
us to reveal ongoing cell-cell contacts and their 
contact history with distinct reporters simulta- 
neously in one mouse (fig. S19H). 


Discussion 


We used synNotch modules (5, 6, 10) to de- 
velop an in vivo strategy for monitoring dy- 
namic cell-cell contacts and for tracing cell 
contact history in mice (tables S1 and S2). These
two systems can also be applied simultane- 
ously in one mouse if different reporters are 
selected for separate genetic readouts, en- 
abling both monitoring of ongoing cell-cell con- 
tacts and tracing of contact history in diverse 
biological contexts. One question is whether 
the affinity of synNotch ligand-receptor binding
could influence cell-cell dissociation.
Previous studies with the synNotch strategy
(5, 6, 10) may alleviate this concern. We also
chose a low-affinity version of the αGFP nanobody
LaG17 (24) to further mitigate this concern.
We studied cardiac tissues because the
developing heart offers clear spatial segrega- 
tion and cell fate transitions to test the per- 
formance of our system with a natural readout 
in a physiological context (Fig. 2B). A second 
example of endocardial ECs’ migration away 
from the heart to the liver bud and subsequent 
contribution to the liver vasculature supports 
the observation of a natural development pro- 
gram (31) by our gTCCC system (Fig. 2J). 
Direct cell-cell contact is needed to activate 
synNotch, because both the sender lysate and 
purified mGFP failed to activate the synNotch 
system in receiver cells (fig. S6), consistent with 
previous studies showing that the mechanical 
force exerted by sender cells through contact is 
required for Notch activation in receiver cells 
(5, 39). It should also be noted that some cel- 
lular contact may not necessarily induce phys- 
iological cell-cell communications or regulation 
of cell fate plasticity. The biological significance 
of the migration of endocardial cells to the liver 
and the outgrowth of tumor blood vessels 
merits further investigation, but these results 
are a good demonstration of what the tech- 
nology described here can do. 

The dynamic monitoring capacity of our 
gLCCC system uncovered dynamic interactions 
of ECs with other cell lineages in development, 
tissue homeostasis, and pathological condi- 
tions. The tracing ability of the gTCCC system 
detected the contribution of endocardial cells 
to liver vasculature and tumor vessel outgrowth. 
The reporter readouts of gLCCC and gTCCC 
differ in sensitivity, because the reporter in 
gLCCC reflects the strength of tTA-mediated 
gene activation, whereas the reporter in gTCCC 
is a binary readout after permanent recombi- 
nation and is ubiquitously active due to the 
Rosa26-CAG promoter (tables S1 and S2). In
this study, we also generated three special mouse 
lines for Cre-induced expression of synNotch 
ligand or receptor. Although the R26-mGFP
and H11-αGFP-N-tTA mouse lines enable
synNotch ligand and receptor expression, respectively,
in any type of cell with a Cre driver,
Tigre-synNotch further made possible ligand
and receptor expression in Cre⁺ cells and all
other Cre⁻ cells in one mouse line, allowing us to
explore the cell-cell contact map broadly without
predefining cell types (table S1).
Contact-dependent labeling could also be use- 
ful for studying stem cell-niche cell interac- 
tions and their functions in normal and disease 
contexts. In addition to unraveling cell-cell 
communication, the contact-dependent cell 
tracing demonstrated here also offers a poten- 
tial new method for cell fate mapping in vivo, 
providing positional information on cells that 
migrate away from their origins and/or adopt
a new fate during development, homeostasis,
regeneration, and disease.

QUANTUM SIMULATION
Observing the quantum topology of light 


Jinfeng Deng, Hang Dong, Chuanyu Zhang, Yaozu Wu, Jiale Yuan, Xuhao Zhu, Feitong Jin,
Hekang Li, Zhen Wang, Han Cai, Chao Song, H. Wang, J. Q. You, Da-Wei Wang


Topological photonics provides a powerful platform to explore topological physics beyond traditional 
electronic materials and shows promising applications in light transport and lasers. Classical degrees of 
freedom are routinely used to construct topological light modes in real or synthetic dimensions. Beyond the 
classical topology, the inherent quantum nature of light provides a wealth of fundamentally distinct topological 
states. Here we implement experiments on topological states of quantized light in a superconducting circuit, 
with which one- and two-dimensional Fock-state lattices are constructed. We realize rich topological 
physics including topological zero-energy states of the Su-Schrieffer-Heeger model, strain-induced 
pseudo-Landau levels, valley Hall effect, and Haldane chiral edge currents. Our study extends the 
topological states of light to the quantum regime, bridging topological phases of condensed-matter 
physics with circuit quantum electrodynamics, and offers a freedom in controlling the quantum states 
of multiple resonators. 


The quantum Hall effect (1) reveals new
phases of matter that are classified by the 
topological invariants of energy bands (2). 
For two-dimensional electrons in strong 
magnetic fields, the chiral edge states 
between Landau levels contribute to the quan- 
tized Hall conductivity, which is immune to 
local defects. This topological effect can also
exist without Landau levels, such as in the
Haldane model (3), which lays the basis for 
topological insulators (4). The optical simula- 
tion of quantum Hall edge states (5) opens 
a new research area, topological photonics 
(6-8), which brings a wealth of applications 
in routing and generating electromagnetic 
waves, such as backscattering-free waveguides 
(9) and topological insulator lasers (10). Clas- 
sical degrees of freedom such as frequencies 
and orbital angular momenta have been widely 
used to synthesize new lattice dimensions to 
embed the topological modes (11-13). Such
pure classical topology of light is in stark con- 
trast to the topological phases of electrons, 
where the quantum wave and fermionic sta- 
tistics play a fundamental role. Intriguingly, 
new topological states emerging from light 


quantization and bosonic statistics have been 
predicted beyond classical interpretation (14-17). 
Recent development in circuit quantum electrodynamics
(QED) (18) makes it possible to realize
these intrinsic quantum topological states
of light, which provide quantum degrees of 
freedom in engineering photonic topology 
(14, 19) and offer topological control knobs 
in bosonic quantum information processing 
(20-23). 

Compared with lattices of modes in real or 
synthetic dimensions in classical topological 
photonics, the topological states of quantized 
light are embedded in lattices of Fock states 
$\prod_i |n_i\rangle$, with $n_i$ being the photon number in the
$i$th mode. In the Fock-state lattice (FSL), a
mode provides a dimension (14, 19, 20, 24),
in contrast to a site in traditional lattices, including
those in synthetic dimensions (11-13, 25).
The FSLs exploit the infinite quantum Hilbert 
space of light, enabling the construction of 
high-dimensional lattices with only a few 
cavity modes. To sketch such dimensional 
scalability, we use the Jaynes-Cummings (JC) 
model (26), which describes the interaction 
between a two-level atom and quantized light.
Here, however, we couple the atom to multiple quantized
light modes. With two light modes,
the Fock states form one-dimensional (1D) lat- 
tices of the Su-Schrieffer-Heeger (SSH) model 
(Fig. 1A) (27). By adding just one other mode, 
we obtain two-dimensional (2D) strained honey- 
comb lattices (Fig. 1B) (28). These lattices are 
characterized by site-dependent coupling strengths,
which originate from the property of the bosonic
annihilation operator $a$,


$$a|n\rangle = \sqrt{n}\,|n-1\rangle. \qquad (1)$$


For the vacuum state, $a|0\rangle = 0$, which leads to
natural edges of FSLs when the photon number
in one of the cavities reduces to zero. FSLs also
have topological edges that host zero-energy
states, resulting from the competition between
resonators in exchanging photons with the
atom. Such a simple mechanism enables FSLs
to realize several important models in topological
physics, in particular the seminal SSH
and Haldane models, which have been the
focus in various quantum platforms (29-36).
Here we demonstrate adiabatic transport of
topological zero-energy states in 1D SSH FSLs,
where Fock states are topologically transferred
from one cavity to another while maintaining
the quantumness in superposition states. In
2D FSLs, we observe the valley Hall effect (VHE)
(37) and the Haldane chiral edge current (38),
which offer a topological route of engineering
quantum states of multiple resonators.

Fig. 1. Fock-state lattices of multimode Jaynes-Cummings models.
(A) Topological transport of the zero-energy state of the SSH FSL with N = 5.
The sublattice sites of $s = \uparrow$ ($\downarrow$) are denoted by squares (circles) and labeled by $n_1 n_2$,
the photon numbers in $R_1$ and $R_2$. The thicknesses of the lines connecting
neighboring sites are proportional to the coupling strengths $t_1$ (red) and $t_2$ (blue).
The wave function envelopes of four zero-energy states are schematically drawn
with different colors. (B) The valley Hall response and the Haldane chiral edge
state in a 2D FSL with N = 10. An excited qubit is coupled to three resonators
with different photon numbers $n_j$ (left). The coupling strengths are proportional
to $\sqrt{n_j + 1}$, which introduces competition between resonators to obtain a photon
from the qubit. All the Fock states with the same N are coupled by the JC
Hamiltonian to form a honeycomb lattice (right). The inhomogeneous coupling
strengths induce an effective magnetic field in the 2D FSL. The VHE is characterized
by the wave packets at the two valleys moving in opposite directions
perpendicular to an applied force (the black arrow). A Lifshitz topological edge
(dashed line) separating the semimetallic and insulator phases is located on the
incircle, which can host the Haldane chiral edge states (yellow wave packet with
the arrow indicating the moving direction).

Leveraging the advantageous integrability 
and tunability of the circuit QED platform 
(39-42), we design and fabricate a superconduc- 
ting circuit to build and engineer the FSLs. 
The key elements of the circuit are a central
gmon qubit (43) ($Q_0$) and three resonators ($R_j$
with $j$ running from 1 to 3), all with tunable
frequencies. Each resonator $R_j$ is coupled to $Q_0$
through an inductive coupler ($C_j$) (Fig. 2A).
The coupling strengths $g_j/2\pi$ can be continuously
tuned by changing the magnetic flux in
$C_j$. In addition, each resonator $R_j$ is capacitively
coupled to an ancilla qubit $Q_j$ for the preparation
and readout of the resonator state.




Other characteristics of the resonators and 
qubits can be found in the supplementary 
materials. 

The Hamiltonian of the coupled system of
the $R_j$ and $Q_0$ can be described by a multimode JC
model (26) in the rotating-wave approximation,


$$H = \frac{\hbar\omega_0}{2}\sigma_z + \sum_{j=1}^{d}\hbar\omega_j a_j^\dagger a_j + \sum_{j=1}^{d}\hbar g_j\left(\sigma^+ a_j + a_j^\dagger\sigma^-\right), \qquad (2)$$


where $a_j$ is the annihilation operator of $R_j$ with
the transition frequency $\omega_j$, $\sigma^+ = |\uparrow\rangle\langle\downarrow|$ and
$\sigma^- = |\downarrow\rangle\langle\uparrow|$ are the raising and lowering operators
of $Q_0$ with the transition frequency $\omega_0$, and $d$ is
the number of resonator modes. The Hamiltonian
conserves the total excitation number
$N = \sum_j n_j + (\sigma_z + 1)/2$, where $n_j$ is the
photon number of $R_j$ and $\sigma_z = |\uparrow\rangle\langle\uparrow| - |\downarrow\rangle\langle\downarrow|$.

Fig. 2. Adiabatic transport of the topological zero-energy states in the Fock-state
Su-Schrieffer-Heeger model. (A) False-color circuit image of the device of
this experiment. Inset: Symbolized configuration of the key elements, a central gmon
qubit $Q_0$ (green circle) coupled to three resonators (cyan, red, and yellow squares
for $R_1$, $R_2$, and $R_3$) via tunable couplers (blue twin circles). (B) Experimental pulse
sequences for the adiabatic transport. We prepare the initial Fock state of $R_2$ by
repeatedly exciting its ancilla qubit $Q_2$ with a $\pi$-pulse and tuning it in resonance with
$R_2$ to swap the photons (upper panel). After the initialization, we tune $R_1$, $R_2$, and
$Q_0$ in resonance (middle panel) and modulate $C_1$, $C_2$ to tune the coupling strengths
$g_1$ (cyan line) and $g_2$ (red line) (lower panel). Finally, we measure the joint population
of $R_1$, $R_2$, and $Q_0$. (C) The observed evolution of the zero-energy state wave packet
in the numerical simulation (upper panel) and experiment (lower panel). Obviously, $|\psi_0\rangle$
only occupies the $|\downarrow\rangle$ sublattice. In the numerical simulation, we use the parameters
of the resonators and qubit listed in table S1. All data in this paper, except
that for quantum state tomography, are averaged over five runs of experiments.
(D) The two-mode Wigner function of the resonator state at t = 300 ns in the plane
cuts along axes Re($\alpha_1$)-Re($\alpha_2$) and Im($\alpha_1$)-Im($\alpha_2$), and the fidelity F = 0.735 (see
fig. S6 for pulse sequence of tomography and more data at other times).

Fig. 3. The pseudo-Landau levels in the 2D Fock-state lattice with N = 5. (A) The evolution of the
excited-state population of the qubit $Q_0$. (B) Fast Fourier transform of the Rabi oscillation. The vertical axis is
the frequency component divided by 2. The solid line is the numerical simulation, and the circles are the
experimental data. (C) The eigenstates in the zeroth and positive pseudo-Landau levels of the 2D FSL, with
eigenenergies corresponding to the Fourier peaks. Each point labels an eigenstate characterized by the
chirality $C$. The degeneracy of the $n$th Landau level is $N - n + 1$.


Topological transport 


For $d = 2$ and $N$ excitations, $2N + 1$ states
$|s; n_1, n_2\rangle$ are coupled in a bipartite tight-binding
lattice with the spin states $s = \uparrow, \downarrow$
labeling the two sublattices (Fig. 1A). When
$Q_0$ is resonant with both resonators, all these
$2N + 1$ states have the same energy, which is
set as the zero energy. Because the coupling
strengths $t_j = g_j\sqrt{n_j}$ ($j = 1, 2$) depend on the
photon numbers, $t_1 > t_2$ and $t_1 < t_2$ on the
left- and right-hand sides of the FSL, resulting
in two different topological phases of the SSH
model (27). A topological zero-energy state
is located around the lattice sites satisfying
$t_1 = t_2$, which is the topological edge of the
SSH model. We write $g_j = g_0\lambda_j$, where $g_0$ is a
fixed nonzero coupling strength and the $\lambda_j$ are
the tunable parameters satisfying $\lambda_1^2 + \lambda_2^2 = 1$.
The topological zero-energy state can be written
as a two-mode binomial state (14, 44)


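$$|\psi_0\rangle = \sum_{n=0}^{N} \sqrt{\frac{N!}{n!\,(N-n)!}}\;\lambda_2^{\,n}\,(-\lambda_1)^{N-n}\,|\downarrow; n, N-n\rangle, \qquad (3)$$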


which only occupies the $|\downarrow\rangle$ sublattice. By
adiabatically tuning $\lambda_1$ from 0 to 1, we can
transport the topological zero-energy state 


968 2 DECEMBER 2022 + VOL 378 ISSUE 6623 


from the right end of the lattice to the left end, 
or vice versa. 
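
The zero-energy property can be checked numerically with a short sketch. It assumes that the zero-energy state is the $N$-photon Fock state of the dark mode $\lambda_2 a_1 - \lambda_1 a_2$, whose amplitudes on $|\downarrow; n, N-n\rangle$ are $\sqrt{\binom{N}{n}}\,\lambda_2^{n}(-\lambda_1)^{N-n}$, and that the couplings between sublattices are $g_0\lambda_j\sqrt{n_j}$. The function names are illustrative and do not come from the paper.

```python
from math import comb, sqrt

def binomial_state(N, lam1, lam2):
    """Amplitudes c[n] on |down; n, N-n> for n = 0..N (n = photons in resonator 1).

    Assumed form: the N-photon Fock state of the dark mode lam2*a1 - lam1*a2.
    """
    return [sqrt(comb(N, n)) * lam2**n * (-lam1)**(N - n) for n in range(N + 1)]

def leak_into_up_sublattice(N, g0, lam1, lam2):
    """Apply the resonant two-mode JC coupling to the binomial state.

    Returns the amplitudes that land on |up; m, N-1-m> for m = 0..N-1.
    A zero-energy state confined to the |down> sublattice gives all zeros.
    """
    c = binomial_state(N, lam1, lam2)
    out = []
    for m in range(N):
        # sigma^+ a_1 maps |down; m+1, N-1-m> to sqrt(m+1) |up; m, N-1-m>
        via_r1 = g0 * lam1 * sqrt(m + 1) * c[m + 1]
        # sigma^+ a_2 maps |down; m, N-m> to sqrt(N-m) |up; m, N-1-m>
        via_r2 = g0 * lam2 * sqrt(N - m) * c[m]
        out.append(via_r1 + via_r2)
    return out

if __name__ == "__main__":
    N, g0 = 5, 9.0                      # g0/2pi in MHz, value quoted in the text
    lam1 = lam2 = 1 / sqrt(2)           # equal-weight point of the sweep
    state = binomial_state(N, lam1, lam2)
    print("norm:", sum(a * a for a in state))
    print("max leak:", max(abs(x) for x in leak_into_up_sublattice(N, g0, lam1, lam2)))
```

The two contributions to each $|\uparrow\rangle$ site cancel pairwise, which is the destructive interference that keeps the state on the $|\downarrow\rangle$ sublattice for any $\lambda_1$, $\lambda_2$ on the unit circle.
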

In the experiment we select $R_1$ and $R_2$ to
construct the SSH FSL, with $R_3$ being far
detuned and effectively decoupled from the
system. In the experimental pulse sequences
(Fig. 2B), we first prepare the initial state
$|\downarrow; 0, 5\rangle$, which is the topological zero-energy
state of the SSH FSL with $N = 5$ and $\lambda_1 = 1$,
by pumping five photons successively into $R_2$
via $Q_2$ (upper panel of Fig. 2B). Then we tune
$R_1$, $R_2$, and $Q_0$ in resonance at the frequency
$\omega_{\mathrm{int}}/2\pi \approx 4.81$ GHz and sinusoidally modulate
the coupling strengths, where $g_0/2\pi \approx 9$ MHz,
$\lambda_1 = |\cos(2\pi\nu t)|$, and $\lambda_2 = |\sin(2\pi\nu t)|$ with
$\nu = 416$ kHz $\ll g_0$ to satisfy the adiabatic
condition (lower panel of Fig. 2B). Finally, the
wave packet of the zero-energy state in the
FSL is measured (see supplementary materials),
with the data shown in Fig. 2C. The
adiabatic transport of the topological edge state
is witnessed by the oscillation of the photons
between $R_1$ and $R_2$ following Eq. 3 with time-dependent
$\lambda_1$ and $\lambda_2$. The zero-energy state is
topologically protected by the energy gap $g_0$
from other eigenstates of the FSL and maintains
coherence during the transport (45). To
show this, we further measure the density
matrix of the two resonators by quantum state
tomography (Fig. 2D). The two-mode binomial
state remains a Fock state in the combinational
dark mode of the two resonators, $\lambda_2 a_1 - \lambda_1 a_2$,
and the quantumness of the states is evident
from the negative values of the two-mode
Wigner functions.
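
As a rough arithmetic check of the sweep timing, the sketch below computes when $\lambda_1 = |\cos(2\pi\nu t)|$ first equals $\lambda_2$ and when it first reaches zero, using the quoted $\nu = 416$ kHz. Comparing $\nu$ with $g_0/2\pi$ is an assumption about how the adiabatic condition is meant; the helper names are illustrative.

```python
def adiabatic_sweep_times_ns(nu_khz):
    """Times (ns) at which |cos(2*pi*nu*t)| first equals |sin(2*pi*nu*t)| and first reaches 0."""
    period_ns = 1e6 / nu_khz          # one period of cos(2*pi*nu*t), in ns
    equal_weight_ns = period_ns / 8   # phase 2*pi*nu*t = pi/4
    full_transfer_ns = period_ns / 4  # phase 2*pi*nu*t = pi/2
    return equal_weight_ns, full_transfer_ns

def adiabaticity_ratio(g0_over_2pi_mhz, nu_khz):
    """Ratio (g0/2pi)/nu; a value much larger than 1 means the sweep is slow compared with the gap."""
    return g0_over_2pi_mhz * 1e3 / nu_khz

if __name__ == "__main__":
    equal_ns, full_ns = adiabatic_sweep_times_ns(416.0)
    print(f"equal weights at {equal_ns:.1f} ns, full transfer at {full_ns:.1f} ns")
    print(f"(g0/2pi)/nu = {adiabaticity_ratio(9.0, 416.0):.1f}")
```

Under these assumptions the equal-weight point falls close to the $t = 300$ ns at which the tomography in Fig. 2D is reported, and a full transfer between the two resonators takes about twice as long.
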


Valley Hall effect 

When $d = 3$, the Fock states in the subspace
with $N$ excitations form a two-dimensional
honeycomb lattice containing $(N + 1)^2$ sites
(Fig. 1B). The site-dependent coupling strengths
introduce a strain, which has the effect of a
magnetic field (46-48) and results in $\sqrt{n}$-scaling
pseudo-Landau levels (14) when $Q_0$ is resonant
with all three resonators. We observe
the Landau levels by analyzing the spectra
of the lattice dynamics (49). In the experiment,
we prepare the initial state $|\downarrow; 0, 5, 0\rangle$
and resonantly couple $R_1$, $R_2$, and $R_3$ to $Q_0$ with
coupling strengths $g_j/2\pi = 9$ MHz. We measure
the evolution of the probability of finding
$Q_0$ in the $|\uparrow\rangle$ state and then perform a fast Fourier
transform. We obtain peaks approximately
located at $\sqrt{n}\,\Omega_0$ with $\Omega_0 = \sqrt{3}g_j$ (Fig. 3C). The
degenerate states in the same Landau level are
distinguished by their chiralities


$$C = b_+^\dagger b_+ - b_-^\dagger b_-, \qquad (4)$$


where $b_\pm = \sum_{j=1}^{3} a_j \exp(\mp i 2\pi j/3)/\sqrt{3}$ are
the annihilation operators of the two chiral
dark modes that are decoupled from the qubit.
The chirality $C$ plays the role of the lattice momentum
in conventional lattices, and $C = N$
and $C = -N$ correspond to the two corners of
the Brillouin zone, denoted as $K$ and $K'$ valleys,
respectively (14, 50).
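
The counting behind these statements can be sketched in a few lines. The sketch enumerates the sites of the three-mode lattice, compares their number with $(N+1)^2$, sums the level degeneracies $N - n + 1$ quoted in the Fig. 3 caption, and checks that the coefficient vectors of $b_\pm$ are orthogonal to the bright mode $(a_1 + a_2 + a_3)/\sqrt{3}$ that couples to the qubit when all $g_j$ are equal. It assumes that every level with $n \geq 1$ appears at both $+\sqrt{n}\,\Omega_0$ and $-\sqrt{n}\,\Omega_0$; the function names are illustrative.

```python
import cmath
from itertools import product
from math import sqrt

def fsl_sites(N):
    """Enumerate sites (s, n1, n2, n3) with n1 + n2 + n3 + s = N.

    s = 0 labels the qubit-down sublattice and s = 1 the qubit-up sublattice.
    """
    return [(s, n1, n2, N - s - n1 - n2)
            for s in (0, 1)
            for n1, n2 in product(range(N + 1), repeat=2)
            if n1 + n2 <= N - s]

def landau_degeneracies(N):
    """Degeneracy N - n + 1 of the nth pseudo-Landau level, for n = 0..N."""
    return {n: N - n + 1 for n in range(N + 1)}

def chiral_mode_overlaps():
    """Overlap of the b_+ and b_- coefficient vectors with the bright mode (1, 1, 1)/sqrt(3)."""
    overlaps = {}
    for label, sign in (("+", -1), ("-", +1)):
        coeffs = [cmath.exp(sign * 2j * cmath.pi * j / 3) / sqrt(3) for j in (1, 2, 3)]
        overlaps[label] = sum(coeffs) / sqrt(3)
    return overlaps

if __name__ == "__main__":
    N = 5
    deg = landau_degeneracies(N)
    total_states = deg[0] + 2 * sum(deg[n] for n in range(1, N + 1))
    print(len(fsl_sites(N)), total_states, (N + 1) ** 2)
    print({k: abs(v) for k, v in chiral_mode_overlaps().items()})
```

The vanishing overlaps reflect the fact that the three cube roots of unity sum to zero, which is why photons in the chiral modes do not exchange energy with the qubit in this equal-coupling setting.
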

A Lifshitz topological edge (51) on the incircle
separates the FSL into two phases, a
semimetallic phase within the incircle and a
band insulator phase outside of it (see the
dashed line in Fig. 1B). The states in the zeroth
Landau level are confined within the incircle
by an outside band gap of the insulator (14).
In the semimetal, the strain-induced magnetic
field has opposite signs at the $K$ and $K'$ valleys
(see supplementary materials). By introducing
a linear potential to mimic the effect of an
electric field on electrons, we can observe the
VHE (Fig. 1B); i.e., the Hall response has opposite
signs at the two valleys. To experimentally
demonstrate this effect, we first prepare an
initial state $|\psi_0\rangle$ with $\lambda_1 = \lambda_2 = 1/\sqrt{2}$ in Eq. 3
following the procedure in Fig. 2B. Such an
initial state on the Lifshitz topological edge is
a Gaussian wave function in the zeroth Landau
level (14). Then we bring $R_3$ in resonance with
$Q_0$ and set $g_j/2\pi = 9$ MHz for $j = 1, 2, 3$. The
linear potential with horizontal gradient is introduced
by slightly shifting the frequencies of
$R_1$ and $R_2$,


$$V = \hbar\delta\,(a_1^\dagger a_1 - a_2^\dagger a_2), \qquad (5)$$


where the detuning $\delta/2\pi \approx 1.8$ MHz. We then
measure populations on each lattice site and
obtain the average photon numbers in the
three resonators (Fig. 4B). The linear potential
drives photons from $R_1$ and $R_2$ to $R_3$, whereas
the qubit stays in the ground state. To visualize
the evolution of the wave function, we draw the
population distributions in the FSL at five different
times (Fig. 4C). The wave function first
moves upward perpendicular to the force direction
(black arrow) until being reflected
by the Lifshitz topological edge near the top
vertex, and then moves downward back to the
initial state (up to a phase factor). In particular,
when the wave function is at the center
of the lattice but in different valleys (e.g., at
t = 150 and 350 ns), it moves in opposite directions,
which is a signature of the VHE (14).
In contrast to the valley Hall effect in photonic
lattices, where edges are routinely needed for
the experimental implementation (52, 53),
here we coherently transport the quantum
states to the two valleys and directly measure
the valley Hall drift, thanks to the high
tunability, controllability, and readability of
the superconducting circuit. It is noteworthy
that the qubit remains in the ground state
during the evolution, which reflects a fundamental
difference between classical and quantum
predictions (fig. S5).

Surprisingly, the VHE can also be observed
with initial classical states such as
$|\psi_c\rangle = |\downarrow; \alpha, 0, -\alpha\rangle$; i.e., $R_1$ and $R_3$ are in the coherent
states $|\alpha\rangle$ and $|-\alpha\rangle$ and $R_2$ is in the vacuum state.
This state can be expanded as a superposition
of two-mode binomial states with different
total excitation numbers $N$ (14). Owing to the
synchronized dynamics in different subspaces,
the fields in the three resonators remain as
a direct product of coherent states, and the
evolution of the average photon numbers in
the three resonators follows curves similar
to that for an initial binomial state (Fig. 4D
and supplementary materials).

The states in the two valleys are identified
by their chiralities. Because the states of the
three resonators are separable for coherent
initial states, we perform simultaneous quantum
state tomography and obtain their Wigner
functions (Fig. 4E). As expected, the phases
are distributed in a counterclockwise ($C > 0$)
and clockwise ($C < 0$) manner at t = 100 and
290 ns when the wave function moves to the
$K$ and $K'$ valleys, respectively. Therefore, the
VHE in FSLs can be used to coherently transport
the wave function between two valleys
and control the chirality of the quantum states
of multiple resonators.

Fig. 4. The valley Hall effect in the 2D Fock-state lattice. (A) The pulse
sequences for controlling the frequencies (upper panel) and coupling
strengths (lower panel). We first prepare the initial state $|\psi_0\rangle$ through the
topological transport in the SSH FSL. Then we tune $R_3$ and $Q_0$ in resonance
at $\omega_{\mathrm{int}}/2\pi = 4.81$ GHz while we detune $R_1$ and $R_2$ to introduce the linear
potential. Meanwhile we set the coupling strengths $g_j/2\pi = 9$ MHz for $j = 1, 2, 3$,
and finally we measure the joint populations at different times during the
evolution. (B) The valley Hall evolution of the average photon numbers in
the three resonators for an initial two-mode binomial state $|\psi_0\rangle$ with N = 5.
The squares are experimental data, and the dashed lines are numerical
simulations with the detuning $\delta/2\pi = 1.80$ MHz. (C) The populations in the FSL for
N = 5 at t = 0, 150, 250, 350, and 480 ns. The left, right, and top vertices
correspond to the states with all photons in $R_1$, $R_2$, and $R_3$. The radius of the
blue circle on each site is proportional to its population. The trajectory of
the wave function is perpendicular to the direction of the effective force
(black arrow). (D) The evolution of the average photon numbers in the
three resonators for the coherent initial state $|\psi_c\rangle = |\downarrow; \alpha, 0, -\alpha\rangle$ with $\alpha = 1.8$.
We detune $R_1$ and $R_3$ to introduce a linear potential $V = \hbar\delta\,(a_1^\dagger a_1 - a_3^\dagger a_3)$. In
the numerical simulation (dashed lines), we set $\delta/2\pi = 2.35$ MHz. (E) The
measured Wigner functions of the three resonator states at time t = 100 and
290 ns (see fig. S3 for the numerical simulation). The phases of the largest
amplitudes of the Wigner functions are labeled on the unit circles, which show
the chirality of the corresponding states in the two valleys.


Haldane model 


By introducing a Floquet modulation of the
coupling strength, $g_j(t) = g_0 + 2g_d\sin[\nu_d t + 2(j-1)\pi/3]$,
we synthesize a most important
model in topological physics, the Haldane
model (33, 35, 36, 38). The effective Hamiltonian
in the second-order perturbation is (see
supplementary materials)


$$H_H = \hbar g_0 \sum_{j=1}^{3}\left(a_j^\dagger\sigma^- + \mathrm{h.c.}\right) + \hbar\kappa\,\sigma_z C, \qquad (6)$$


where $\kappa = -3g_d^2/\nu_d$. The second term in Eq. 6
introduces the complex next-nearest-neighbor
hoppings in the FSL (14, 20) and transforms
flat Landau levels to a two-band structure with
gapless chiral edge states, which originate from
the zeroth Landau level (Fig. 5A). In the experiment,
we directly excite $R_1$ and $R_2$ to obtain
an initial state $|\downarrow; \alpha, -\alpha, 0\rangle$ (see Fig. 5A for its
distribution in the subspace $N = 10$). Then we
periodically modulate the coupling strengths
$g_j(t)$ to realize the Haldane Hamiltonian (see
the control sequence in Fig. 5B). The average
photon numbers are subsequently measured
as a function of time (J7), which shows the
chiral motion; i.e., the wave function rotates in
a counterclockwise manner in the FSL (Fig. 5C).
Ideally the wave function should be on the incircle,
i.e., the Lifshitz topological edge. In the
experiment, the chirally rotating wave function
moves toward the center of the FSL owing
to the decoherence and nonlinearity of the
resonators, as well as the imperfect control
pulses (figs. S8 to S11).

Fig. 5. Chiral edge currents of the Fock-state Haldane model. (A) The energy bands of the Hamiltonian
$H_H$ in Eq. 6 with total excitation number N = 10. The two energy bands are connected by chiral edge
states, with each dot indicating an eigenstate. The populations of a binomial state (with $\lambda_1 = \lambda_2$) on the chiral
edge states are proportional to the radii of the shaded circles. (B) The control sequence in realizing the
Haldane Hamiltonian. We prepare an initial state $|\downarrow; \alpha, -\alpha, 0\rangle$ with $\alpha = 1$ and tune three resonators and
the gmon qubit on resonance at $\omega_{\mathrm{int}}/2\pi = 4.82$ GHz, followed by a Floquet modulation of the coupling
strengths $g_j(t)$ with static amplitude $g_0/2\pi = 2.5$ MHz, dynamic amplitude $g_d/2\pi = 3.25$ MHz, and modulation
frequency $\nu_d/2\pi = 40$ MHz, such that the effective Haldane coupling strength $\kappa/2\pi = -0.79$ MHz. (C) Chiral
edge currents shown by the average photon numbers in the three resonators. The total average photon
number in the initial state is 2. The gray triangle shows the boundary of the FSL with N = 2, which is the most
occupied subspace at the initial time. The circles show the experimental data, and the depths of the
colors indicate the evolution time. Dashed lines are numerical simulations in the ideal case, whereas solid
lines are those considering relevant parameter imperfections as described in the supplementary materials.


Concluding remarks


In this work, we have demonstrated the coherent
control of topological zero-energy states
in 1D and 2D FSLs. These states only occupy
the sublattice where the qubit is in the $|\downarrow\rangle$
state, and they are protected from other eigenstates
by an energy gap of the vacuum Rabi
frequency. Perturbations with energy smaller 
than this gap, such as slow modulation of 
coupling strengths and small detunings be- 
tween the resonators, are used to coherently 
control the zero-energy state to realize topologi- 
cal transport and VHE. Floquet modulations 
are introduced to realize the Haldane chiral 
edge currents. The techniques that we have 
developed in this study can also be applied 
to control other eigenstates in the FSL, such 
as the excited states in higher Landau levels. 
Our methods can be generalized to investi- 
gate topological states of more complex qubit- 
resonator coupled systems, where the number 
of resonators determines the dimension of the 
FSLs and each state of the qubits labels a sublattice,
with richness beyond known topological
phases in condensed-matter physics. Our
study paves the way for investigating topological
phases in FSLs and developing new control
methods for quantum state engineering of 
bosonic modes. 
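
The Floquet parameters quoted for the Haldane experiment can be cross-checked with one line of arithmetic. The sketch below assumes the second-order expression $\kappa = -3g_d^2/\nu_d$ and the values $g_d/2\pi = 3.25$ MHz and $\nu_d/2\pi = 40$ MHz from the Fig. 5 caption; the function name is illustrative.

```python
def haldane_kappa_over_2pi_mhz(gd_over_2pi_mhz, nud_over_2pi_mhz):
    """Return kappa/2pi in MHz, assuming kappa = -3 * g_d**2 / nu_d.

    Dividing both sides by 2*pi gives
    kappa/2pi = -3 * (g_d/2pi)**2 / (nu_d/2pi), so MHz inputs give a MHz result.
    """
    return -3.0 * gd_over_2pi_mhz ** 2 / nud_over_2pi_mhz

if __name__ == "__main__":
    kappa = haldane_kappa_over_2pi_mhz(3.25, 40.0)
    print(f"kappa/2pi = {kappa:.2f} MHz")
```

With these inputs the result rounds to the $-0.79$ MHz stated in the caption, which supports reading the prefactor as 3 and the modulation amplitude as entering quadratically.
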
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Phosphoenolpyruvate reallocation links nitrogen 
fixation rates to root nodule energy state 


Xiaolong Ke, Han Xiao, Yaqi Peng, Jing Wang, Qi Lv, Xuelu Wang


Legume-rhizobium symbiosis in root nodules fixes nitrogen to satisfy the plant’s nitrogen demands. The 
nodules’ demand for energy is thought to determine nitrogen fixation rates. How this energy state is sensed 
to modulate nitrogen fixation is unknown. Here, we identified two soybean (Glycine max) cystathionine 
β-synthase domain-containing proteins, nodule AMP sensor 1 (GmNAS1) and NAS1-associated protein
1 (GmNAP1). In the high nodule energy state, GmNAS1 and GmNAP1 form homodimers that interact with
the nuclear factor-Y C (NF-YC) subunit (GmNFYC10a) on mitochondria and reduce its nuclear accumulation. 
Less nuclear GmNFYC10a leads to lower expression of glycolytic genes involved in pyruvate production, which 
modulates phosphoenolpyruvate allocation to favor nitrogen fixation. Insight into these pathways may help 
in the design of leguminous crops that have improved carbon use, nitrogen fixation, and growth. 


Legumes have evolved specialized nitrogen-fixing
organs called root nodules by establishing
symbiotic relationships with
rhizobia, which require a large amount
of extra energy for nitrogen fixation. The
symbiotic nodules obtain photoassimilates
(mainly sucrose) and metabolize them through
glycolysis to produce phosphoenolpyruvate
(1-3). Phosphoenolpyruvate is converted into
either malate to fuel atmospheric nitrogen 
fixation in bacteroids or pyruvate for adeno- 
sine triphosphate (ATP) production in mito- 
chondria for nitrogen assimilation and other 
cellular activities (fig. S1), and its allocation
likely regulates nodule nitrogen fixation ca- 
pacity (4-10). Soybean (Glycine max) nodule 
nitrogen fixation capacity is low under an- 
aerobic or phosphorus-deficient conditions but 
increases when the supply of oxygen or phos- 
phorus increases as the nodule energy state 
changes (11, 12). As the nodule energy state 
decreases after transfer to darkness or high 
nitrate supply, leguminous nodule nitrogen 
fixation capacity also decreases (13-16). Given 
the high energy needs of symbiotic nitrogen 
fixation, it is likely that nodule energy-state 
changes regulate nodule nitrogen fixation capacity.
How nodule energy state is sensed and
how the information regulates nodule nitrogen 
fixation capacity has been unclear. 
Cystathionine β-synthase (CBS) domains are
conserved protein modules that can bind ade- 
nosyl compounds, such as adenosine mono- 
phosphate (AMP), adenosine diphosphate 
(ADP), ATP, and S-adenosyl methionine, to reg- 
ulate biological processes by associating with 
other functional regions of CBS domain- 
containing proteins (CDCPs) (17-21). AMP- 
activated protein kinases, with α, β, and γ
subunits, sense cellular energy status and
respond to low-energy stress in mammalian
cells (22, 23). The greater AMP concentration
associated with low cellular energy states can
enhance AMP binding to the CBS domains of
the γ subunit of AMP-activated protein kinases
to allosterically activate its catalytic α subunit,
thereby initiating downstream signaling
to maintain cellular energy homeostasis
(18, 19). CBSX proteins contain a pair of CBS 
domains, bind to AMP, and activate thiore- 
doxins in different organelles to maintain 
cellular redox homeostasis in Arabidopsis 
(Arabidopsis thaliana) (24), suggesting a role 
for CDCPs in plant cellular energy sensing. 


GmNAS1 and GmNAP1 sense nodule energy state
to regulate nodule nitrogen fixation capacity 


To identify CDCPs that might sense nodule 
energy state in soybean, we examined the 
expression patterns of 71 soybean CDCP genes
using the Soybean eFP Browser (25, 26). We found
that the CDCP gene GmCBS22 was highly expressed
in nodules, flowers, and leaves, whereas
its close homolog GmCBS14 was specifically
expressed only in root nodules (fig. S2A).
GmCBS22 and GmCBS14 are highly expressed
in mature nodules, with GmCBS22 showing
broad expression and GmCBS14 expressed specifically
in the nodule infection zone and vascular
bundles (fig. S2, B to N). Knockdown of
GmCBS22 and GmCBS14 in hairy roots decreased
nodule nitrogenase activity by about
50%, whereas nodule number and weight were
unaffected (fig. S3, A and B), indicating that
GmCBS22 and GmCBS14 may regulate nodule
nitrogen fixation.

GmCBS22 and GmCBS14 are predicted to
encode proteins with an N-terminal chloroplast
transit peptide (cTP), four tandem CBS domains,
a Phox and Bem1 (PB1) domain, and a
C-terminal transmembrane region (TMR) (Fig. 
1A and fig. S4, A and B). We expressed cTP- 
GFP-GmCBS22 [GmCBS22 fused to green 
fluorescent protein (GFP) after the cTP] or 
GmCBS14-GFP-TMR (GmCBS14 fused to GFP 
before the TMR) in Nicotiana benthamiana 
leaf epidermal cells and soybean nodules and 
observed localization to mitochondria (Fig. 1B 
and fig. S4C), which was further confirmed by 
subcellular fractionation (fig. S4D). When the 
C-terminal TMR was conformationally ob- 
scured by cyan fluorescent protein (CFP) in 
GmCBS22-CFP and GmCBS14-CFP fusion pro- 
teins, CFP fluorescence accumulated in the 
nucleus and the cytoplasm (fig. S4E), indicat- 
ing that the C-terminal TMR is required for 
mitochondrial localization of GmCBS22 and 
GmCBS14. We noticed occasional ring fluores- 
cence patterns for cTP-GFP-GmCBS22 and 
GmCBS14-GFP-TMR in N. benthamiana leaf 
epidermal cells, suggesting their localization 
to the mitochondrial outer membrane (fig. 
S4F) (27). 

We then tested the ability of GmCBS22 and 
GmCBS14 proteins to bind various adenylates 
and found that one GmCBS22 molecule can
bind one AMP with a dissociation constant (Kd)
of 3.94 µM and ADP with a Kd of 40.32 µM
(Fig. 1, C and D), whereas ATP and cyclic AMP 
(cAMP) did not bind to GmCBS22 (fig. S5A). 
By contrast, GmCBS14 did not bind to any of 
the tested adenylates (fig. S5B). Deletion of any 

Fig. 1. GmNAS1 and GmNAP1 are nodule energy sensors that regulate the
response of nodule nitrogen fixation capacity to nodule energy state.
(A) Schematic diagram of the GmCBS22 and GmCBS14 proteins. (B) GmCBS22 and
GmCBS14 localization in nodule cells. Nodules of pGmCBS22:cTP-GFP-GmCBS22
and pGmCBS14:GmCBS14-GFP-TMR transgenic hairy roots were sectioned to
observe GFP fluorescence. Mitotracker was used to stain mitochondria; mCherry
indicates infected cells containing the strain USDA110-mCherry. Scale bars are
20 µm. (C and D) Isothermal titration calorimetry analysis of GmCBS22 binding
to AMP (C) or ADP (D). The black lines (bottom) are the best fit to the one-site
model. (E) Pull-down assays of His-GmCBS22, His-GmCBS14, GST-GmCBS22,


of the four CBS domains in GmCBS22 abol- 
ished its AMP binding ability (fig. S6, A to D). 
The secondary structures of GmCBS22 and 
GmCBS14 were unaffected by the addition of
AMP, ADP, ATP, or cAMP (fig. S7, A and B).
Like many other CDCPs (27), GmCBS22 and 
GmCBS14 formed homodimers and hetero- 
dimers (Fig. 1E and fig. S8). Recombinant 




and GST-GmCBS14. The experiment was performed three times with comparable
results. (F) Changes in ATP, ADP, and AMP contents and energy charge of
W82 nodules after sucrose treatment. Data are means ± SD of four biological
replicates. FW, fresh weight. (G) Nodule nitrogenase activity of W82 and GmNAS1
and GmNAP1 mutants without and with sucrose treatment. Boxes represent
the first quartile, median, and third quartile, and whiskers represent minimum
and maximum values. Significant differences were determined by Student's
t test (*P < 0.05, and ***P < 0.001) in (F) or by one-way analysis of variance
(ANOVA) and post hoc Tukey's test, with different lowercase letters indicating
significant differences (P < 0.05) in (G).


His-GmCBS22 lacking any CBS domain still
interacted with GmCBS22 and GmCBS14 tagged
with glutathione S-transferase (GST) (fig. S9A),
whereas deletion of the PB1 domain of GmCBS22
eliminated the homodimerization of GmCBS22
and heterodimerization of GmCBS22 and
GmCBS14, and deletion of the PB1 domain
of GmCBS14 also eliminated the homodimerization
of GmCBS14 and heterodimerization
of GmCBS22 and GmCBS14 (fig. S9B). Moreover,
the addition of AMP enhanced formation
of GmCBS22-GmCBS14 heterodimers but
blocked formation of GmCBS22 homodimers
(Fig. 1E and fig. S8B). Therefore, GmCBS22
and GmCBS14 may function as energy sen- 
sors in soybean nodules by directly binding 
to AMP and forming dynamic dimers on the 
mitochondrial membrane. We therefore re- 
named GmCBS22 as soybean nodule AMP 
sensor 1 (GmNAS1) and GmCBS14 as NAS1- 
associated protein 1 (GmNAP1). 
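
The dissociation constants reported above (3.94 µM for AMP and 40.32 µM for ADP, fitted with a one-site model) can be turned into an approximate fractional occupancy of GmCBS22 at a given free ligand concentration. The sketch below is illustrative only: it assumes simple one-site equilibrium binding with no competition between AMP and ADP, and the ligand concentrations in the loop are hypothetical values, not measurements from this study.

```python
# One-site equilibrium binding: fraction bound = [L] / (Kd + [L]).
# Kd values are the ones quoted in the text; the ligand concentrations
# used below are hypothetical and chosen only for illustration.

KD_AMP_UM = 3.94    # GmCBS22-AMP dissociation constant (µM), from the text
KD_ADP_UM = 40.32   # GmCBS22-ADP dissociation constant (µM), from the text


def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Return the fractional occupancy of a single binding site."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_um / (kd_um + ligand_um)


if __name__ == "__main__":
    for conc_um in (1.0, 10.0, 100.0):  # hypothetical free ligand levels
        amp = fraction_bound(conc_um, KD_AMP_UM)
        adp = fraction_bound(conc_um, KD_ADP_UM)
        print(f"{conc_um:6.1f} µM ligand: AMP site {amp:.2f}, ADP site {adp:.2f}")
```

At equal concentrations the roughly tenfold lower Kd means AMP occupies the site far more readily than ADP, which is consistent with GmCBS22 acting primarily as an AMP sensor.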

Under our growth conditions, sucrose treatment,
which significantly increased nodule
energy state (Fig. 1F), can enhance nodule nitrogenase
activity in soybean “Williams 82”
(W82) plants (Fig. 1G). To investigate whether
GmNAS1 and GmNAP1 can sense nodule energy
state to regulate nodule nitrogen fixation
capacity, we created knockout mutants
of GmNAS1 and GmNAP1, namely, cr-nas1,
cr-nap1, cr-nas1nap1-1, and cr-nas1nap1-2
(fig. S10, A and B). Under our growth conditions,
these mutants showed similar nodule
number, weight, and nitrogenase activity relative
to the wild type (Fig. 1G and fig. S10C).
However, although the nodule energy state of
these mutants was similar to that of the wild
type (fig. S11, A to D), sucrose treatment failed to
enhance nodule nitrogen fixation capacity in all
the mutants (Fig. 1G), indicating that GmNAS1
and GmNAP1 mediate the linkage between
nitrogen fixation capacity and nodule energy
state (fig. S11, E to I). Changes in light intensity
affected photosynthesis, nodule energy state,
and nitrogen fixation capacity (fig. S12, A to F).
Knockout of GmNAS1 and GmNAP1 also eliminated
the increase in nodule nitrogen fixation
capacity that follows enhanced light
intensity (fig. S12F). Together, these results
show that GmNAS1 and GmNAP1 are needed
for nodule nitrogen fixation capacity to respond
to changes in nodule energy state.
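
The "nodule energy state" discussed here is reported in Fig. 1F through ATP, ADP, and AMP contents and an energy charge. The text does not spell out the formula, so the sketch below assumes the standard adenylate energy charge definition, (ATP + 0.5 × ADP) / (ATP + ADP + AMP). The sample concentrations are hypothetical and are not data from this study.

```python
# Adenylate energy charge, assuming the standard definition:
#   EC = (ATP + 0.5 * ADP) / (ATP + ADP + AMP)
# Inputs may be in any unit as long as all three share it
# (for example nmol per g fresh weight).


def energy_charge(atp: float, adp: float, amp: float) -> float:
    """Return the adenylate energy charge, a value between 0 and 1."""
    if min(atp, adp, amp) < 0:
        raise ValueError("adenylate contents must be non-negative")
    total = atp + adp + amp
    if total == 0:
        raise ValueError("total adenylate content must be positive")
    return (atp + 0.5 * adp) / total


if __name__ == "__main__":
    # Hypothetical contents before and after a treatment that raises ATP
    # and lowers AMP; the numbers are placeholders for illustration.
    before = energy_charge(atp=40.0, adp=30.0, amp=30.0)
    after = energy_charge(atp=60.0, adp=30.0, amp=10.0)
    print(f"energy charge before: {before:.2f}, after: {after:.2f}")
```

A shift of adenylates from AMP toward ATP raises the energy charge, which is the direction of change the sucrose treatment is described as producing.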


GmNAS1 and GmNAP1 regulate GmNFYC10a 
nuclear localization 


To elucidate how GmNAS1 and GmNAP1 regulate
nodule nitrogen fixation capacity in response
to nodule energy state, we conducted
co-immunoprecipitation (co-IP) assays followed
by tandem mass spectrometry (MS/MS). We
identified additional GmNAP1 interactors, including
a nuclear factor-Y C (NF-YC) subunit,
GmNFYC10a (fig. S13, A and B, and data S1).
Considering the role of NF-YC subunits in
symbiotic nitrogen fixation (28, 29), we tested
and confirmed the interaction of GmNAS1 and
GmNAP1 with GmNFYC10a (Fig. 2, A and B).
The addition of AMP largely blocked the interaction
between GmNAS1 and GmNFYC10a
but not that between GmNAP1 and GmNFYC10a
(Fig. 2B). GmNAS1 lacking any CBS domain
still interacted with GmNFYC10a, and the addition
of AMP hardly affected their interaction
(fig. S14A). Given that AMP impaired GmNAS1
homodimerization (Fig. 1E), we hypothesized
that GmNAS1 or GmNAP1 homodimerization is
required for their interaction with GmNFYC10a.
Indeed, GmNAS1 and GmNAP1 lacking the
PB1 domain failed to interact with GmNFYC10a
(fig. S14B). Moreover, AMP-promoted GmNAS1-GmNAP1
heterodimerization (Fig. 1E and fig.
S8B) also facilitated the dissociation of GmNAP1
homodimers and inhibited its interaction with
GmNFYC10a (Fig. 2C). Interaction of GmNAS1
and GmNAP1 with GmNFYC10a was inhibited
by AMP (Fig. 2D) but enhanced by sucrose
in vivo (fig. S15).

Unlike the fluorescence that was mainly
observed on mitochondria during bimolecular
fluorescence complementation (BiFC) assays
of GmNFYC10a and GmNAS1 and GmNAP1
(Fig. 2A), GmNFYC10a-GFP only localized to the
nucleus (Fig. 2E), suggesting that GmNFYC10a
is proximal to mitochondria through its interaction
with GmNAS1 and GmNAP1. Because
higher AMP levels can weaken the interaction
of GmNAS1 and GmNAP1 with GmNFYC10a
(Fig. 2, B to D), we hypothesized that a low
nodule energy state may enhance GmNFYC10a
translocation from mitochondria to nuclei.
Indeed, oligomycin, an inhibitor of mitochondrial
ATP synthase, or AMP treatment significantly
promoted the nuclear accumulation of
GmNFYC10a-GFP in nodule cells (Fig. 2F).
More GmNFYC10a-GFP localized to the cytoplasm
in the cr-nas1 and cr-nap1 nodules than
in W82 nodules under mock conditions (Fig.
2F and fig. S16, A and B), confirming that
GmNAS1 and GmNAP1 maintain GmNFYC10a
localization to mitochondria. Oligomycin or
AMP treatment promoted GmNFYC10a-GFP
nuclear localization in cr-nap1 but not in cr-nas1
mutants (fig. S16, A and B), suggesting that
GmNAS1 but not GmNAP1 can primarily respond
to increased AMP to enhance GmNFYC10a
nuclear accumulation. GmNFYC10a-GFP accumulated
in the cytoplasm and nucleus in most
nodule cells of the cr-nas1nap1-2 mutant, regardless
of whether they were treated with
oligomycin or AMP (fig. S16C). Taken together,
these results demonstrate that GmNAS1 and
GmNAP1 regulate the GmNFYC10a nuclear
accumulation in response to changing nodule
energy state.


GmNFYC10 regulates glycolysis for 
pyruvate production 


We then created the knockout mutant cr-nfyc10
through genome editing (fig. S17A) as well as
gGmNFYC10a-Flag overexpression plants and
found that neither showed changes in nodule
number, weight, or nitrogenase activity relative
to the wild type under normal growth
conditions (Fig. 3, A and B, and fig. S17, B
to D). However, unlike the wild type, the
cr-nfyc10 and gGmNFYC10a-Flag plants did
not show the sucrose-enhanced nodule nitrogen
fixation capacity (Fig. 3, A and B), indicating
that the proper expression level of GmNFYC10
is required for efficient nitrogen fixation under
the high nodule energy state.

To explore how the GmNAS1-GmNAP1-GmNFYC10
module regulates nodule nitrogen
fixation capacity, we conducted RNA sequencing
(RNA-seq) analysis of Ri-GmNAS1-NAP1
(Ri-GmCBS22-14) nodules (Fig. 3, C and D).
Kyoto Encyclopedia of Genes and Genomes
(KEGG) analysis (30) of the down-regulated
genes showed a variety of biological processes
regulated by GmNAS1 and GmNAP1 (Fig. 3E).
Considering the pivotal role of glycolysis in
nodule energy supply (fig. S1) (1-3), we analyzed
26 down-regulated genes involved in the
glycolysis-gluconeogenesis pathway (data S2).
Ten of these are glycolytic genes, five of which
encode pyruvate kinases (PKs) (Fig.
3F and fig. S18, A and B). Expression of enolase
genes in glycolysis was not down-regulated in
Ri-GmNAS1-NAP1 nodules (fig. S18C). Promoter
analysis of the 10 down-regulated glycolytic
genes revealed a CCAAT element, the binding
site of the NF-Y transcriptional complex
(fig. S19A) (31). Furthermore, we determined
that GmNFYC10a binds to the PK1a, GAPC1,
and PK2a promoters and activates their expression
(Fig. 3, G and H). Expression of most
glycolytic genes was down-regulated in cr-nfyc10
nodules (fig. S19B). Expression of several glycolytic
genes was reduced in the Ri-GmNFYC10
nodules (fig. S19, C and D), whereas expression
of most glycolytic genes was higher in
the gGmNFYC10a-Flag nodules (fig. S19, E
and F). Given that half of the regulated glycolytic
genes encode PKs (Fig. 3F), we examined
pyruvate contents in Ri-GmNFYC10 and
gGmNFYC10a-Flag nodules. Compared with
wild-type nodules, pyruvate contents were
about 40% lower in the Ri-GmNFYC10 nodules
and 50% higher in the gGmNFYC10a-Flag
nodules (fig. S19G). Knockdown of PK1,
GAPC1, or PK2 decreased nodule nitrogenase
activity without affecting nodule number or
weight (fig. S20, A and B). Together, these findings
indicate that GmNFYC10, regulated by
GmNAS1 and GmNAP1, can activate glycolysis
for pyruvate production in soybean nodules.


The GmNAS1-GmNAP1-GmNFYC10 module 
regulates PEP allocation in nodules 


Because a reduced nodule energy state can 
promote GmNFYC10a nuclear accumulation 
(Fig. 2F), we determined that sucrose treat- 
ment leads to diminished nuclear accumula- 
tion and enhanced mitochondrial localization 
of GmNFYC10a (Fig. 4A). Expression levels of 
the glycolytic genes activated by GmNFYC10 

Fig. 2. GmNAS1 and GmNAP1 mediate GmNFYC10a nuclear accumulation in
response to nodule energy state. (A) BiFC assays of GmNAS1 and GmNAP1 and
GmNFYC10a. The mt-rk plasmid was co-infiltrated into N. benthamiana leaves to
label mitochondria with mCherry fluorescence. YFP, yellow fluorescent protein.
(B) Pull-down assays of GmNAS1 and GmNAP1 and GmNFYC10a. (C) Pull-down assay
of GmNAP1 and GmNFYC10a in the presence of GmNAS1. The grayscale ratio of GST-
GmNFYC10a to His-GmNAP1 (GST/His) with AMP was normalized to that without
AMP; the result of three biological replicates is shown in the scatterplot to the right.
(D) Co-IP assays of GmNAS1 and GmNAP1 and GmNFYC10a. The numbers under the

(≥2-fold) in Ri-GmNAS1-NAP1 nodules. (E) KEGG analysis of down-regulated genes
(≥2-fold) in Ri-GmNAS1-NAP1 nodules. The red frame indicates the glycolysis-
gluconeogenesis pathway. (F) Schematic diagram of the glycolytic pathway.
Ten down-regulated glycolytic genes in Ri-GmNAS1-NAP1 nodules are indicated.


were also down-regulated after sucrose treatment
(Fig. 4B), which did not happen in the
cr-nas1, cr-nap1, cr-nas1nap1-1, and cr-nfyc10
mutants (fig. S21, A to D). In agreement with

adenine dinucleotide; NADP, nicotinamide adenine dinucleotide phosphate; PFK,
phosphofructokinase. (G) ChIP-qPCR (chromatin immunoprecipitation-quantitative
polymerase chain reaction) analysis of GmNFYC10a binding to the PK1a, GAPC1,
and PK2a promoters. NF-Y binding sites and DNA fragments for qPCR analysis are
indicated by green circles and short horizontal lines, respectively. Data are means ±
SD of three biological replicates. TSS, transcription start site. (H) Transcriptional
activation of the PK1a, GAPC1, and PK2a promoters by GmNFYC10a. Firefly
luciferase (LUC) activity was normalized to Renilla luciferase (REN) activity. Data
are means ± SD of three biological replicates. Significant differences were
determined by one-way ANOVA with post hoc Tukey's test (P < 0.05) in (A)
and (B) or by Student's t test (*P < 0.05, **P < 0.01, and ***P < 0.001) in (G) and
(H). In (A) and (B), boxes represent the first quartile, median, and third quartile,
and whiskers represent minimum and maximum values.


(Fig. 4C). Although several glycolytic genes 
were also down-regulated in sucrose-treated 
leaves of wild type, the down-regulation was 
unaffected in sucrose-treated leaves of cr-nas1, 

Fig. 4. GmNAS1, GmNAP1 and GmNFYC10 regulate glycolysis to modulate
PEP allocation in response to nodule energy state. (A) GmNFYC10a-Flag
protein abundance in the nucleus and mitochondria of nodule cells. CYC1,
actin, and histone H3 were used as mitochondrial, cytoplasmic, and nuclear
markers, respectively. The experiment was performed three times with
comparable results. (B) Relative expression levels of glycolytic genes in W82
nodules without and with sucrose treatment. (C) Relative expression levels of
glycolytic genes in W82, cr-nas1, cr-nap1, and cr-nas1nap1-1 nodules after sucrose
treatment. (D and E) Pyruvate (D) and 2-PG (E) contents in W82, cr-nas1,
cr-nap1, and cr-nas1nap1-1 nodules after sucrose treatment. (F) Relative
expression levels of enolase genes ENO2a, ENO2b, ENO2c, and ENO2d in W82,
cr-nas1, cr-nap1, and cr-nas1nap1-1 nodules after sucrose treatment. (G to I) PEP
(G), OAA (H), and malate (H) contents and ratio of pyruvate to OAA (I) in
W82, cr-nas1, cr-nap1, and cr-nas1nap1-1 nodules after sucrose treatment. Data in
(D), (E), and (G) to (I) are means ± SD of at least four biological replicates.
Significant differences were determined by Student's t test (*P < 0.05, **P <
0.01, and ***P < 0.001) in (B), (C), and (F) or by one-way ANOVA with post hoc
Tukey's test (P < 0.05) in (D), (E), and (G) to (I); ns is not significant.

cr-nap1, and cr-nas1nap1-1 (fig. S21E). Sucrose
treatment also reduced PK2a protein content
and PK activity in W82 nodules, but the decrease
was eliminated in the cr-nas1, cr-nap1,
and cr-nas1nap1-1 nodules (fig. S21, F and G).
Therefore, the pyruvate contents in the cr-nas1,
cr-nap1, and cr-nas1nap1-1 nodules were higher
than those in wild-type nodules after sucrose
treatment (Fig. 4D). Furthermore, we found
that the 2-phosphoglycerate (2-PG) and phosphoenolpyruvate
(PEP) contents were far lower
than the pyruvate and oxaloacetic acid (OAA)
contents in nodules (Fig. 4, D, E, G, and H), and
expression of ENO2a, ENO2b, ENO2c, and
ENO2d in the GmNAS1 and GmNAP1 mutants
was similar to that in the wild type (Fig. 4F),
suggesting that most of the 2-PG was converted
into PEP and most of the PEP into pyruvate and
OAA in nodules, and that the total amount of
PEP converted into pyruvate and OAA in W82
nodules should be similar to that in the cr-nas1,
cr-nap1, and cr-nas1nap1-1 nodules. OAA
and malate contents in these mutant nodules
were lower (Fig. 4H), and the ratio of pyruvate
to OAA in mutant nodules was increased (Fig.
4I). Moreover, under our growth conditions,
the pyruvate, OAA, and malate contents in the
cr-nas1, cr-nap1, and cr-nas1nap1-1 nodules
were similar to those in W82 nodules (fig. S21,
H to J), which is in line with the unaffected
nodule nitrogen fixation capacity of these mutants
under the same growth conditions (Fig.
1H). Therefore, in response to an increased
nodule energy state, GmNAS1 and GmNAP1
reduce GmNFYC10 nuclear accumulation, which
suppresses glycolysis and pyruvate production,
thereby modulating PEP allocation to
favor nodule nitrogen fixation.


Discussion 


Leguminous plants regulate high-energy-consuming
symbiotic nitrogen fixation to optimize
carbon utilization for sustaining growth
under different environments (32). In this
study, we identified GmNAS1 and GmNAP1
as nodule-specific energy sensors in soybean.
Under our growth conditions, a limited sucrose
supply keeps nodule cells in a relatively
low-energy state with high AMP levels, which
promotes the formation of GmNAS1-GmNAP1
heterodimers and leads to the nuclear accumulation
of GmNFYC10, driving glycolysis
and pyruvate production (fig. S22, A and B).
Upon additional sucrose supply, when nodule
energy state rises as AMP levels fall, GmNAS1
and GmNAP1 mainly form homodimers that
maintain GmNFYC10 on mitochondria to reduce
its nuclear accumulation, leading to
lower pyruvate production and more PEP allocated
to OAA (fig. S22, C and D). Thus, the
ratio of PEP converted into pyruvate and
OAA is largely dependent on nodule energy
state in the wild type (36/64 at the low
nodule energy state versus 3/97 at the high
nodule energy state) to ensure basic cellular
activities at the low nodule energy state
and to power nitrogen fixation at the high
nodule energy state (fig. S22, A and C). In the
cr-nas1nap1 mutant, enhanced allocation of
PEP into OAA at the high-energy state was
attenuated (pyruvate/OAA = 36/64 at the low
nodule energy state versus 12/88 at the high
nodule energy state) (fig. S22, B and D).
Therefore, GmNAS1 and GmNAP1 are activated
by the high-energy state under adequate
carbohydrate supply to hold GmNFYC10a on
the mitochondria, thereby enhancing PEP
allocation to OAA and malate for nitrogen
fixation, which is different from the canonical
energy sensors AMPK, SNF1, and SnRK1
that are activated by the low-energy state in
response to lack of nutrients (22, 33, 34). Although
GmNAS1 and GmNAP1 knockouts did
not affect nodule nitrogen fixation capacity
under normal growth conditions (Fig. 1G),
GmNAS1 and GmNAP1 knockdowns in hairy
roots impaired nodule nitrogen fixation capacity
(fig. S3). The nodule energy state of hairy
roots is lower than that of normal plants (fig.
S23A), which could be caused by the retarded
plant growth of hairy roots (fig. S23B). The
up-regulated genes in hairy root nodules
showed that many stress-related pathways
are activated (fig. S23, C to E), suggesting that
GmNAS1 and GmNAP1 may also play a role
in maintaining nodule nitrogen fixation capacity
under stressful conditions. Phylogenetic
analysis showed that GmNAS1 and GmNAP1
and their homologs within the ureide-exporting
legumes form an independent cluster (fig.
S24A and data S3) and that their homologs in
Phaseolus vulgaris, but not in Lotus japonicus
and Medicago truncatula, can complement
the mutant nodule phenotypes of cr-nas1 and
cr-nap1 (fig. S24B). Therefore, this set of energy
sensors likely emerged in the ureide-exporting
legumes to ensure elaborate energy
use during nitrogen fixation, which may be
necessary because of enhanced de novo purine
biosynthesis and one-carbon metabolism in
nodules of these legumes (6, 9). Our findings
show how legume nodule energy state modulates
nodule performance through GmNAS1
and GmNAP1 and reveal targets for designing
crops efficient in carbon utilization, symbiotic
nitrogen fixation, and growth under
varying environmental conditions.
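
The pyruvate/OAA allocation ratios quoted in the Discussion (36/64 versus 3/97 in the wild type, 36/64 versus 12/88 in the cr-nas1nap1 mutant) can be compared directly as shares of PEP directed to OAA. The short sketch below only restates those published ratios; the dictionary layout and function names are illustrative choices, not part of the study.

```python
# PEP allocation ratios (pyruvate, OAA) as quoted in the text for the
# low and high nodule energy states. The data structure is illustrative.
ALLOCATION = {
    "wild type": {"low": (36, 64), "high": (3, 97)},
    "cr-nas1nap1": {"low": (36, 64), "high": (12, 88)},
}


def oaa_share(pyruvate: float, oaa: float) -> float:
    """Fraction of converted PEP that is allocated to OAA."""
    return oaa / (pyruvate + oaa)


def oaa_shift(genotype: str) -> float:
    """Gain in OAA share when moving from the low to the high energy state."""
    states = ALLOCATION[genotype]
    return oaa_share(*states["high"]) - oaa_share(*states["low"])


if __name__ == "__main__":
    for genotype in ALLOCATION:
        print(f"{genotype}: OAA share rises by {oaa_shift(genotype):.2f}")
```

Under these numbers the wild type shifts 33 percentage points of PEP toward OAA at the high energy state, compared with 24 points in the double mutant, which is the attenuation the text describes.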

Exceptional fracture toughness of CrCoNi-based
medium- and high-entropy alloys at 20 kelvin


Dong Liu, Qin Yu, Saurabh Kabra, Ming Jiang, Paul Forna-Kreutzer, Ruopeng Zhang,
Madelyn Payne, Flynn Walsh, Bernd Gludovatz, Mark Asta, Andrew M. Minor,
Easo P. George, Robert O. Ritchie


CrCoNi-based medium- and high-entropy alloys display outstanding damage tolerance, especially at cryogenic 
temperatures. In this study, we examined the fracture toughness values of the equiatomic CrCoNi and 
CrMnFeCoNi alloys at 20 kelvin (K). We found exceptionally high crack-initiation fracture toughnesses of 262
and 459 megapascal-meters½ (MPa·m½) for CrMnFeCoNi and CrCoNi, respectively; CrCoNi displayed a crack-
growth toughness exceeding 540 MPa·m½ after 2.25 millimeters of stable cracking. Crack-tip deformation
structures at 20 K are quite distinct from those at higher temperatures. They involve nucleation and restricted
growth of stacking faults, fine nanotwins, and transformed epsilon martensite, with coherent interfaces that
can promote both arrest and transmission of dislocations to generate strength and ductility. We believe
that these alloys develop fracture resistance through a progressive synergy of deformation mechanisms,
dislocation glide, stacking-fault formation, nanotwinning, and phase transformation, which act in concert to 
prolong strain hardening that simultaneously elevates strength and ductility, leading to exceptional toughness. 


High-entropy alloys (HEAs) have attracted
increasing attention in the metallurgy 
community as a class of metallic mate- 
rials that derive their properties from 
the presence of multiple principal ele- 
ments, rather than from a single dominant con- 
stituent as in most traditional metallic alloys 
(e.g., Fe in steels). Inspired by two seminal 
papers (1, 2), the field has grown to encompass
equiatomic as well as nonequiatomic alloys, 
single-phase solid solutions, and multiphase 
compositionally complex alloys, with the goal 
of finding combinations of properties that 
differ from those of conventional alloys (3-7). 
One prominent group of such materials are 
the single-phase, face-centered cubic (fcc), equi- 
atomic alloys based on the CrCoNi system. 
Among these, the equiatomic CrMnFeCoNi 
alloy is the most characterized of all HEAs 
(1, 6-11). It came to prominence because its 
room-temperature strength and ductility can 
be substantially enhanced at liquid nitrogen 
temperature (8, 9) without compromising 
toughness (10). Moreover, its crack-initiation
fracture toughness, K_JIc, remained at roughly
220 MPa·m½ over the temperature range 293
to 77 K, with a crack-growth toughness, K_ss,
of >300 MPa·m½ (after 2.25 mm of crack
extension) (10). More recent experiments on
this HEA showed a similar increase in strength 
and toughness (the latter measured in terms 
of the absorbed deviatoric strain energy) when 
the strain rate was increased from 10° s +
(quasistatic compression) up to extremely high
rates of 6 × 10° s* (dynamic shear) (11).

1. School of Physics, University of Bristol, Bristol BS8 1TL, UK.
2. Materials Sciences Division, Lawrence Berkeley National
Laboratory, Berkeley, CA 94720, USA. 3. ENGIN-X, ISIS
Facility, Rutherford Appleton Laboratory, Harwell Campus,
Oxon OX11 0QX, UK. 4. Department of Materials Science and
Engineering, University of California, Berkeley, CA 94720,
USA. 5. National Center for Electron Microscopy, Molecular
Foundry, Lawrence Berkeley National Laboratory, Berkeley,
CA 94720, USA. 6. School of Mechanical and Manufacturing
Engineering, University of New South Wales (UNSW Sydney),
Sydney, NSW 2052, Australia. 7. Materials Science and
Technology Division, Oak Ridge National Laboratory, Oak

There have been several derivatives of the 
CrMnFeCoNi alloy (12-14), most notably the 
single-phase equiatomic CrCoNi medium- 
entropy alloy (MEA), which displays even better
properties. At 77 K, this MEA was found to
have a K_JIc of 273 MPa·m½ and a K_ss exceeding
400 MPa·m½ (15). Although strength and
toughness are often mutually exclusive prop- 
erties (16), the CrCoNi alloy exhibits excep- 
tionally high damage tolerance, with fracture 
toughness values among the largest ever re- 
ported. Such CrCoNi-based multiple principal 
element alloys are clearly strong candidate 
materials for potential applications in extreme 
environments, such as at very high strain rates 
and cryogenic temperatures. 


Results 


Given their exceptional damage tolerance, 
we investigated the mechanical properties of 
CrCoNi and CrMnFeCoNi alloys at even lower 
temperatures (~20 K) by performing uniaxial 
tensile tests and nonlinear elastic J-based 
fracture toughness tests (17) in a liquid helium 
environment (using the testing setup shown 
in figs. S1 to S3). Impact tests on the CrCoNi 
alloy have reported high Charpy V-notch en- 
ergies of close to 400 J at 77 K, which were 

Fig. 1. J-R curves and fracture toughness values for the CrCoNi and CrMnFeCoNi alloys as a function
of temperature. J-R curves showing the variation in the J-integral as a function of crack extension Δa for
(A) the CrMnFeCoNi HEA and (B) the CrCoNi MEA, between room temperature (RT, ~293 K) and 20 K.
Corresponding K-based fracture toughness values back-calculated from the R-curves are shown in (C) for
CrMnFeCoNi and (D) for CrCoNi, where K_JIc represents the crack-initiation toughness and K_ss the crack-
growth toughness, defined at the ASTM E1820 maximum limit of valid crack extension where Δa = 2.25 mm.
Note how the toughness of both alloys at 20 K is higher than at other temperatures. The toughness values for
the CrCoNi alloy are believed to be among the highest toughnesses ever reported.

reduced by ~10% at 4.2 K (18). However, it
remains unclear how samples that contain a 
sharp crack would perform at temperatures 
below 77 K, where anomalies in the temper- 
ature dependence of strength and ductility 
have been reported (19-22). Furthermore, full 
resistance-curve measurements that define 
both the crack-initiation and crack-growth 
fracture toughness have not been performed 
on medium- or high-entropy alloys at low tem- 
peratures approaching that of liquid helium. 
In addition to measurements of the crack- 
initiation and crack-growth toughnesses (and 
their corresponding stress intensity-based 
values), we performed in situ neutron diffrac- 
tion measurements and extensive postfracture 
electron backscatter diffraction (EBSD) analy- 
sis, fractography, and particularly transmis- 
sion electron microscopy to examine in detail 
the salient plastic deformation mechanisms 
and defect behavior that represent the fun- 
damental basis of their exceptional fracture 
resistance, which we find progressively in- 
creases with decreasing temperature, unlike 
for most metallic materials. 

As described in more detail in the supplemen- 
tary materials, the CrCoNi and CrMnFeCoNi 
alloys that we investigated were arc melted, 
drop cast, and homogenized at 1200°C before 
being cold worked at room temperature and 
recrystallized at 800°C to give a single-phase 
equiaxed grain structure with an average grain 
size of ~21 µm in CrMnFeCoNi and ~8 µm in
CrCoNi. Their uniaxial tensile stress-strain 
curves and crack-resistance curves (R-curves), 
experimentally measured at 20 K, are shown 
in fig. S4 and Fig. 1, A and B, respectively. For 
comparison, Fig. 1, A and B, also shows R-curves 
taken at ambient (293 K), dry ice (198 K), and 
liquid nitrogen (77 K) temperatures. We back- 
calculated the corresponding stress intensity- 
based fracture toughness values from the J-values 
as a function of temperature (20 to 293 K) and 
plotted K_JIc (Fig. 1, C and D), which we determined
according to ASTM Standard E1820
(17), and K_ss, defined at the maximum limit of
valid crack extension (17), where Δa = 2.25 mm
(near the plateau of the R-curve). 
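
The back-calculation of stress intensity-based toughness from J-values mentioned above can be sketched with the standard plane-strain relation K = sqrt(E' J), where E' = E / (1 - ν²), which is the J-to-K conversion given in ASTM E1820. The elastic constants and J-value in the example are hypothetical placeholders, not the alloy properties or measurements used in this study.

```python
import math

# Plane-strain conversion between the J-integral and an equivalent
# stress intensity: K = sqrt(E' * J), with E' = E / (1 - nu**2).
# Units: J in kJ/m^2, E in GPa, K in MPa*m^0.5.


def k_from_j(j_kj_per_m2: float, e_gpa: float, nu: float) -> float:
    """Convert a J-integral value to a stress intensity in MPa*m^0.5."""
    if j_kj_per_m2 < 0 or e_gpa <= 0 or not (0 <= nu < 1):
        raise ValueError("invalid J, modulus, or Poisson ratio")
    e_prime_pa = e_gpa * 1e9 / (1.0 - nu ** 2)   # plane-strain modulus, Pa
    j_si = j_kj_per_m2 * 1e3                      # J/m^2
    return math.sqrt(e_prime_pa * j_si) / 1e6     # Pa*m^0.5 -> MPa*m^0.5


def j_from_k(k_mpa_sqrt_m: float, e_gpa: float, nu: float) -> float:
    """Inverse conversion: stress intensity back to J in kJ/m^2."""
    e_prime_pa = e_gpa * 1e9 / (1.0 - nu ** 2)
    return (k_mpa_sqrt_m * 1e6) ** 2 / e_prime_pa / 1e3


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    k = k_from_j(j_kj_per_m2=100.0, e_gpa=200.0, nu=0.3)
    print(f"K = {k:.1f} MPa*m^0.5")
```

Because both E and ν vary with temperature, a conversion over the 20 to 293 K range would need temperature-specific elastic constants; the single pair of values here is only for demonstrating the arithmetic.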

Both alloys show markedly rising R-curves 
that progressively increase with decreasing 
temperature, especially CrCoNi. Exceedingly 
high fracture toughness values are exhibited
at 20 K; the K_JIc and K_ss values for the
CrMnFeCoNi alloy are, respectively, 262 and
383 MPa·m½, whereas the corresponding values
for CrCoNi are 459 and 544 MPa·m½. The
latter value represents one of the highest tough- 
nesses on record. Fracture surfaces at 20 K 
showed no sign of brittle fracture features and 
exhibited 100% ductile failure by microvoid 
coalescence, in agreement with earlier work 
at higher temperatures of 77 to 293 K (10, 15),
with dimple sizes in the range of several mi- 
crometers (Fig. 2, C and D). 

Fig. 2. Microstructure and fractography of the CrCoNi-based alloys. EBSD scans show the equiaxed
single-phase microstructures in (A) CrMnFeCoNi and (B) CrCoNi alloys. The sample direction associated with 
the IPF coloring is the direction normal to the EBSD scan plane. Fracture in both alloys occurs by microvoid 
coalescence. Examples of such ductile fractures in CrCoNi are shown at (C) 293 K and (D) 20 K. 


Discussion 

Despite their exceptionally high fracture tough- 
ness, these alloys do not have complex micro- 
structures, as they are simple single-phase solid 
solutions (Fig. 2, A and B). Thus, an important 
question that immediately arises is the origin 
of this exceptional fracture resistance and 
why it should be so progressively enhanced 
at cryogenic temperatures. 

To address this, we look to the cooperative 
defect behavior responsible for plastic defor- 
mation in these alloys (23-25), using mainly 
the CrCoNi alloy to illustrate the prototypical 
behavior at 20 K versus room temperature. We 
used postfracture EBSD analysis and high- 
resolution transmission electron microscopy 
(HRTEM) of the heavily deformed regions 
within the plastic zone, directly adjacent to 
the crack tip where local strains can read- 
ily be on the order of 60 to 100%. Although 
the microstructure starts off as a rather 
simple single-phase solid solution, deforma- 
tion at 20 K transforms the structure into a 
rich and complex mixture of phases and de- 
fect structures. 

To investigate the microscopic deformation 
mechanisms, the compact-tension samples 
used in the fracture toughness tests were first 
cut through the midthickness to obtain sections
of the fracture path that were predominantly
under plane-strain conditions. These were first 
mechanically and then electrolytically polished 
for examination with EBSD. EBSD image qual- 
ity (IQ) and inverse pole figure (IPF) maps 
(Fig. 3) for the CrCoNi alloy tested at 20 K 
indicate extensive deformation-induced twin- 
ning in the highly deformed grains (under 
high stress triaxiality) within the plastic zone 
in the vicinity of the crack tip. Similar defor- 
mation mechanisms have been reported for 
CrCoNi-based alloys at these low temperatures 
in uniaxial tensile tests where the degree of 
triaxiality is far lower (18-22, 25). We then 
made sections from these regions into TEM 
foils using a focused ion beam (FIB) lift-out 
method, finishing with a 5-kV Ga+ polish, for
HRTEM and four-dimensional scanning trans- 
mission electron microscopy (4D-STEM). The 
CrCoNi samples that were deformed and frac- 
tured at both room temperature and at 20 K 
show a strong propensity for planar deforma- 
tion features (Fig. 4). We conducted HRTEM 
imaging to identify the nature of these fea- 
tures. At room temperature, they include both 
nanotwins and bands of stacking faults, but no 
well-defined sequence (exceeding three layers) 
of any hexagonal close-packed (hcp) phases 
(Fig. 4, A and B). We did identify, however, a 
considerable number of dislocations in the
area between the planar features (shown in
detail in fig. S5). In contrast, for samples tested
at 20 K, the dominant planar features are
deformation bands full of stacking faults (Fig. 4D),
with the frequent appearance of “laths” of the
hcp phase with a thickness of a few nanometers
(Fig. 4H). The latter are notably absent
in specimens deformed at room temperature.
Additionally, CrCoNi demonstrates a decreased
tendency for nanotwinning at 20 K, as well as a
smaller size of nanotwins compared with those
formed at room temperature.

Fig. 3. EBSD maps. (A) Image quality (IQ) and (B) inverse pole figure (IPF) maps, showing the fracture
path and accompanying deformation behavior of the CrCoNi alloy at 20 K. For the fracture propagating from
left to right, predominantly plane-strain sections (taken at the midthickness of compact-tension samples)
show the microstructure in the heavily deformed region directly ahead of the crack tip (within the plastic
zone). The sample direction associated with the IPF coloring is the direction normal to the EBSD scan plane.

We conducted successive 4D-STEM exper- 
iments to identify the size and distribution of 
these microstructural features in both CrCoNi 
(Fig. 4) and CrMnFeCoNi (fig. S6). Both virtual 
dark-field images and selected-area diffraction 
patterns were reconstructed to extract the spa- 
tially resolved structural information of the 
planar features in the CrCoNi samples tested 
at 293 and 20 K. The planar features gener- 
ated at room temperature are primarily well- 
defined nanotwins or stacking faults, with 
the former having sizes in the range of sev- 
eral nanometers (Fig. 4C). In contrast, the 
planar deformation features identified at 
20 K contain a combination of diffuse yet finer 
nanotwins, stacking faults, and a well-defined 
hcp phase (Fig. 4, F and I). We believe that
this change in deformation modes at 20 K, 
promoted by low stacking-fault energies, is 
primarily responsible for the groundbreak- 
ing fracture toughness. 

Stacking-fault energies for CrMnFeCoNi
(26) and CrCoNi (27) have been experimentally
determined as ~30 and 14 mJ·m⁻², respectively,
at room temperature, which are expected to
progressively decrease at lower temperatures.
Indeed, measurements in CrFeCoNiMo0.2 report
the stacking-fault energy to decrease from
28 mJ·m⁻² at 293 K to 11 mJ·m⁻² at 15 K (28).
However, microscopy-based measurements
obtained from balancing forces on finitely
dissociated dislocations in concentrated alloys
have drawn criticism for neglecting local
variations in the Peierls potential due to chemical
fluctuations (29) and their associated lattice
distortion (30), as well as grain-size
dependence (31). Nevertheless, the trend of reduced
stacking-fault energies at lower temperatures 
in these alloys is inarguable and consistent 
with theoretical predictions of the increasing 
energetic stability of the hcp phase relative to 
the fcc phase with decreasing temperature. 
In the CrMnFeCoNi samples tested at 20 K, 
4D-STEM characterization (fig. S6) revealed 
the presence of nanotwins similar to those 
seen in CrCoNi but no evidence of hcp phase 
formation. This likely explains the former’s 
lower strength (fig. S4) and fracture toughness 
(Fig. 1, C and D). 

Although low stacking-fault energies and 
the associated phenomena of planar slip, stacking
faults, nanotwinning, and hcp phase formation
have been individually observed in
simpler fcc alloys, the full sequence described 
above, coupled with strong solid-solution 
strengthening and prolonged strain harden- 
ing, generally has not. Consider, for example, 
dilute Cu-Al alloys (containing up to 10 atomic % 
Al) in which the stacking-fault energy can be 
significantly decreased by the addition of Al 
(to values even lower than those of CrCoNi and 
CrMnFeCoNi). Nevertheless, the Cu-Al binaries 
remain significantly weaker at ~10 K [one-fifth
to one-tenth the critical resolved shear stress
of CrMnFeCoNi (26)]. Similarly, one can con- 
template elements that increase solid-solution 
strengthening (e.g., as a result of large atomic 
size misfits) but have little to no effect on the 
stacking-fault energy, and so on for each of 
the mechanisms discussed above. Because 
a given element in an alloy will at most in- 
fluence one or two of the desired mechanisms, 
multiple alloying elements will likely be needed 
to simultaneously or sequentially activate all 
relevant mechanisms. In the long run, this 
ability to control individual mechanisms, pre- 
cisely when needed, may well prove to be the 
single biggest advantage of multiple princi- 
pal element alloys such as those investigated 
here (32). 

The low deformation temperature limits dis- 
location motion and twin growth by suppress- 
ing thermally activated processes. But the 
increased flow stress increases the formation 
of twins and hcp phases at 20 K, as compared
with room temperature, at which no hcp formation
was detected. As twinning in these
alloys has been shown to be stress-controlled
(33, 34), an fcc→hcp transformation (which
involves a similar change in stacking sequence) 
would be expected to be favored as the flow 
stress increases with decreasing temperature. 
A diffuse network of the resulting planar 
deformation features—nanotwin and phase 
interfaces—acts to further decrease the mean 
free path for dislocation motion. Combined 
with suppressed dynamic recovery at these 
low temperatures, a synergy of deformation 
mechanisms—dislocation glide, stacking fault 
formation, deformation nanotwinning, and 
deformation-induced phase transformation— 
is created at increasing strain levels, which 
presents a highly efficient process for develop- 
ing and, most importantly, prolonging strain 
hardening to restrict the localization of de- 
formation in the crack-tip region. In simple 
terms, the strain hardening naturally increases 
strength but at the same time delays the onset 
of necking, which promotes ductility. Micro- 
mechanical models for the prediction of the 
fracture toughness of materials experiencing 
ductile fracture (35, 36) are based on a critical 
strain being exceeded over a characteristic 
microstructural dimension ahead of a crack 
tip; these models give the J_Ic toughness to be
directly proportional to the product of the flow 
strength, strain to failure, and this charac- 
teristic dimension (which is related to some 
multiple of the particle spacing involved in the 
microvoid coalescence fracture). Accordingly, 
the corresponding elevations in strength and 
ductility resulting from the strain hardening, 
created by the prolonged sequence of multiple 
deformation mechanisms, act in concert to 
enhance the toughness. The extent of strain 
hardening increases as temperature decreases 
owing to several related factors: (i) the lower 
stacking-fault energy, which hinders cross-slip
by increasing the spacing between Shockley
partials; (ii) the promotion of nanotwinning
(also related to lower stacking-fault energy);
and (iii) the onset of the deformation-induced
transformation to the hcp phase (but only to
a limited degree, as too much hcp could
embrittle the material).
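
The proportionality described in this paragraph for the micromechanical ductile-fracture models can be written compactly. The symbols below are notation assumed here for illustration (σ₀ for the flow strength, ε_f for the strain to failure, ℓ₀* for the characteristic microstructural dimension); the text itself gives the relation only in words. The second line combines it with the standard relation between K and J to show how the stress-intensity toughness would scale.

```latex
\begin{align}
  J_{Ic} &\sim \sigma_0 \, \varepsilon_f \, \ell_0^{*}, \\
  K_{JIc} &= \sqrt{E' J_{Ic}} \;\propto\; \sqrt{E' \, \sigma_0 \, \varepsilon_f \, \ell_0^{*}},
  \qquad E' = \frac{E}{1-\nu^{2}} .
\end{align}
```

Read this way, prolonged strain hardening raises both σ₀ and ε_f at once, so the two factors reinforce each other in the product rather than trading off.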

The role of the fcc→hcp transformation is
particularly interesting. The in situ transfor- 
mation to the hexagonal phase has now been 
reported in numerous papers (37-39). It enables 
further strain hardening at higher strains, but 
the resulting epsilon martensite is not neces- 
sarily beneficial for toughness, as hcp phases 
are generally not as ductile as fcc phases, which
in sufficiently large amounts would likely cause 
a ductile-brittle transition at 20 K. Indeed, 
some researchers have reported a decrease in 
ductility at 4.2 to 20 K, as compared with 77 K, 
in alloys similar to ours, which they attribute 
to epsilon martensite formation (20, 40). 
Intriguingly, 0 K ab initio calculations sug- 
gest that, for a given degree of (dis)order, hcp 
CrCoNi has a lower Gibbs energy than the fcc 
equivalent; this relationship inverts at higher 
temperatures owing to the activation of phonon
modes (41) and likely spin fluctuations as
well (42). This picture is consistent with the 
observation of somewhat larger hcp lamellae 
at 20 K, but it must be asked why, in contrast 
to similarly metastable Co (43) or CoNi alloys 
(44), low-temperature deformation still only 
minimally induces the martensitic hcp phase, 
which has been reported in quantities ranging 
from 0 to a maximum of a few vol % (18, 25, 45). 
This observation is also consistent with our 
in situ neutron diffraction data (fig. S7), where 
the maximum stacking-fault probability caused 
by deformation was estimated to be 6.3 x 10°. 
As noted, the theoretical metastability of CrCoNi 
is of similar magnitude as that of elemental 
Co (43), which demonstrates a clear allotropic 
phase transition via a martensitic mechanism 
that is enhanced by mechanical deformation 
(43). Even in more stable fcc CoNi alloys, which
far less readily transform spontaneously to hcp, 
low-temperature deformation has been found 
to promote sizable hcp regions (44). One par- 
ticularly enticing explanation for the observed 
behavior is the presence of quenched-in chemi- 
cal short-range order stabilizing the fcc matrix 
relative to the formation of less-ordered hcp 
regions (46). This scenario would also address 
the apparent restriction of existing hcp regions 
to thin laths within extended planar defect 
structures (37), where local order would be 
disrupted by the occurrence of earlier slip. 
Recent simulations (47) offer specific insight 
on the mechanisms by which hcp and twin
nuclei freely expand in random, but not in 
short-range ordered, samples at room tem- 
perature. Although the material used in this 
study was not prepared in a manner intended 
to promote local ordering (48), recent measurements
such as those of Oh et al. (49) have
suggested that elusive atomic-scale chemical
ordering could be present in even water-quenched
samples. In this case, the physics
of deformation would be similar to that discussed
by Yu et al. (47), with a greater driving
force for hcp formation at lower temperatures.
Whatever the reason, because the hcp laths
are relatively thin, they can provide a disproportionally
large number of barriers to dislocation
motion (and, in turn, work hardening),
even when the overall volume fraction is small,
as in the case of nanotwins (33), which contribute
little to axial strain (tensile ductility)
but significantly to strain hardening. For example,
a single hcp lath with finite thickness
introduces two new barriers within a grain,
thereby splitting the grain into two or three
segments (depending on lath thickness). For
the sake of simplicity, if the hcp lath is assumed
to be very thin and its boundaries are assumed
to have the same strength as grain boundaries,
the effective grain size is halved when one lath
is introduced in each grain, quartered when
three laths are introduced per grain, and so on.
This would induce substantial strengthening
by a dynamic Hall-Petch-type mechanism in
the strain-hardening regime because the phase
transformation occurs dynamically during
straining. As is the case with nanotwins in a
polycrystalline HEA (33), the main role of the
hcp phase is to enhance strain hardening,
which leads indirectly to increased ductility
by postponing necking instability, rather than
directly contributing to the tensile strain.
Despite the hcp phase being brittle, this nevertheless
serves to enhance the toughness, but
with the key constraint that its volume fraction
must be small.

Fig. 4. HRTEM and 4D-STEM characterization of the deformed microstructures in the CrCoNi alloy
adjacent to the fracture surfaces at 293 and 20 K. Overview of STEM and bright-field images of samples
tested at 293 K [(A) to (C)] and at 20 K [(D) to (I)]. (A and B) HRTEM images showing representative
deformation bands with stacking faults and nanotwins, respectively. (C) Virtual dark-field image generated
from the 4D-STEM scan (stronger intensity is in warmer color). The position of the virtual aperture is marked
by the red circle in the inset. The inset is a virtual selected-area diffraction pattern generated from the
region marked in (C), showing the diffraction from the nanotwins. (D) HRTEM image showing a representative
deformation band in the sample. As shown in the inset, a high density of stacking faults can be identified
in the band. (E) A nanotwin identified in the 20 K sample. (F) Virtual dark-field image generated from
the 4D-STEM scan. The virtual aperture is set to pick up the signal from twins and stacking faults, as marked
by the red circle in the inset. (G) Bright-field image of deformation bands. (H) HRTEM image taken from a
[110] orientation at 20 K. Multiple nanometer-sized bands with hcp sequence can be identified. The inset
shows the fast Fourier transform image from one of the hcp bands, representing a [112̄0] orientation.
(I) Virtual dark-field image generated from the 4D-STEM scan. The virtual aperture is set to pick up the signal
from the hcp phase, as marked by the red circle in the inset. The insets in (F) and (I) are virtual diffraction
patterns from the twin and stacking faults and the hcp phase, respectively.

Fig. 5. Ashby map in terms of the fracture toughness, K_c, versus the yield strength, σ_y, for a broad class
of materials. Note the exceptional fracture toughness of the CrCoNi-based medium- and high-entropy alloys,
which appear to be the highest on record. (The toughness results contained in this figure for materials other
than the CrCoNi-based alloys were measured at ambient temperatures.) PC, polycarbonate; PE, polyethylene;
PET, polyethylene terephthalate; PP, polypropylene; PS, polystyrene; PTFE, polytetrafluoroethylene.

The results we present here on the CrCoNi-based
MEAs and HEAs show that a monotonic
improvement in fracture toughness is possible
down to very low temperatures (20 K). This is
rarely the case, especially in bcc and hcp alloys
that undergo a ductile-brittle transition as the
temperature decreases. Even many fcc alloys
show a significant drop in toughness below a
critical temperature. For example, the Charpy
toughness of 304L stainless steel starts dropping
around 223 K and by 77 K is ~35 to 67%
lower, depending on the heat treatment; 316L
stainless steel shows a similar drop in toughness
(50). Many Al alloys, whose Charpy toughness
values are not high to begin with, exhibit a
noticeable toughness drop below ~200 K (51).
The same can be said about Cu-Be alloys, certain
bronzes, and hardenable stainless steels. Thus,
as cryogenic structural materials, equiatomic,
single-phase fcc CrCoNi-based medium- and
high-entropy alloys, in particular the CrCoNi
alloy, appear to be unique. They possess some
of the most impressive mechanical properties
of any metallic alloy reported to date. Indeed,
their crack-initiation and crack-growth fracture
toughness values at 20 K are among the
highest ever recorded (Fig. 5), a fact that we
ascribe to their effective strain-hardening capacity
generated by their synergy of deformation
mechanisms created under increasing
strains, including dislocation slip, stacking-fault
formation, deformation nanotwinning,
and limited hcp epsilon martensite formation.

A broader impact of this work is that the
sequence of mechanisms outlined here can, in
principle, be put to work in other alloy systems,
from around room temperature all the
way down to 20 K, depending on the temperature
regime of interest for a given application.
At the low end of temperatures (~20 K),
potential applications include the long-distance
transportation of liquid hydrogen. With increased
emphasis on climate change and the
need for clean energy, hydrogen might partially
replace fossil fuels, especially if the former
can be produced by electrolysis of water using 
renewable energy. Load-bearing materials that 
need to operate in the frigid temperatures of 
distant planets can also make use of the insights 
gained here. At somewhat higher tempera- 
tures (~110 to 115 K), transportation of liquified 
natural gas across oceans becomes an important 
potential application because pipelines cannot 
easily reach everywhere; the present findings 
can guide the design of damage-tolerant mate- 
rials for such applications. 
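
The dynamic Hall-Petch argument made earlier in this Discussion (thin hcp laths subdividing grains) lends itself to a small numerical illustration. The sketch below uses the idealization stated in the text: laths are treated as infinitely thin and as strong as grain boundaries, so n laths per grain split it into n + 1 segments. The function names and the Hall-Petch coefficient are hypothetical placeholders introduced here; only the ~8 μm CrCoNi grain size is taken from the text.

```python
import math


def effective_grain_size(d_um: float, laths_per_grain: int) -> float:
    """Effective grain size when n thin laths split a grain into n + 1 segments."""
    if laths_per_grain < 0:
        raise ValueError("laths_per_grain must be non-negative")
    return d_um / (laths_per_grain + 1)


def hall_petch_increment(d_um: float, laths_per_grain: int, k_hp: float) -> float:
    """Strength increment k * (d_eff^-1/2 - d^-1/2) from grain subdivision.

    k_hp is a Hall-Petch coefficient in MPa*um^(1/2); the result is in MPa.
    """
    d_eff = effective_grain_size(d_um, laths_per_grain)
    return k_hp * (1.0 / math.sqrt(d_eff) - 1.0 / math.sqrt(d_um))


if __name__ == "__main__":
    grain_size_um = 8.0   # approximate CrCoNi grain size quoted in the text
    k_placeholder = 300.0  # hypothetical coefficient, for illustration only
    for n in (0, 1, 3):
        d_eff = effective_grain_size(grain_size_um, n)
        gain = hall_petch_increment(grain_size_um, n, k_placeholder)
        print(f"{n} laths/grain: d_eff = {d_eff:.1f} um, increment = {gain:.1f} MPa")
```

Because the increment depends on the inverse square root of the segment size, each added lath contributes strengthening even though it adds almost nothing to the hcp volume fraction, which is the point the text makes about a small amount of hcp phase being sufficient.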


Conclusions 


The main takeaway here is the realization that 
multiple strain-hardening mechanisms need 
to be activated, in exactly the right sequence, 
for simultaneous increases in strength, ductil- 
ity, and toughness. This basic insight applies 
across the entire range of cryogenic temper- 
atures, and HEAs and MEAs are uniquely 
qualified to make practical use of this because 
they possess the multiple compositional “dials” 
needed to tune each individual mechanism 
separately without adversely affecting the 
others (32)—this can be extremely difficult (if 
not impossible) in conventional alloys com- 
prising just one or two principal elements. 

Active DNA demethylation promotes cell fate
specification and the DNA damage response 


Dongpeng Wang, Wei Wu, Elsa Callen, Raphael Pavani, Nicholas Zolnerowich,
Srikanth Kodali, Dali Zong, Nancy Wong, Santiago Noriega, William J. Nathan,
Gabriel Matos-Rodrigues, Raj Chari, Michael J. Kruhlak, Ferenc Livak, Michael Ward, Keith Caldecott,
Bruno Di Stefano, André Nussenzweig


Neurons harbor high levels of single-strand DNA breaks (SSBs) that are targeted to neuronal enhancers,
but the source of this endogenous damage remains unclear. Using two systems of postmitotic lineage
specification (induced pluripotent stem cell-derived neurons and transdifferentiated macrophages),
we show that thymidine DNA glycosylase (TDG)-driven excision of methylcytosines oxidized with
ten-eleven translocation enzymes (TET) is a source of SSBs. Although macrophage differentiation favors
short-patch base excision repair to fill in single-nucleotide gaps, neurons also frequently use the long-
patch subpathway. Disrupting this gap-filling process using anti-neoplastic cytosine analogs triggers
a DNA damage response and neuronal cell death, which is dependent on TDG. Thus, TET-mediated active
DNA demethylation promotes endogenous DNA damage, a process that normally safeguards cell identity
but can also provoke neurotoxicity after anticancer treatments.


Defects in DNA repair result in cancer
predisposition as well as neurological
diseases. Although all cell types incur
DNA damage and mutations, neurons
are exceptionally vulnerable to defects in
single-strand break (SSB) repair (1). SSBs are
detected by XRCC1 and PARP (poly-ADP ribose 
polymerase), which recruit factors involved in 
DNA end modification (PNKP, polynucleotide 
kinase 3’-phosphatase; APTX, aprataxin), DNA 
gap filling (POLB, DNA polymerase β family),
and ligation (DNA ligase 1 or 3) (2). These pro- 
teins are essential for most SSB repair events, 
usually comprising “short-patch” reactions in 
which only a single missing nucleotide is re- 
placed. More rarely, a long-patch subpathway 
is employed, which uses extended DNA synthe- 
sis before ligation. However, long-patch SSB 
repair cannot completely compensate for loss 
of short-patch SSB repair, as evidenced by the 
association of neurological diseases with muta- 
tions in genes required for short-patch repair, 
like XRCC1, APTX and PNKP (2). 
Recently, we and others demonstrated that
highly active long-patch, synthesis-associated 
repair (SAR) of SSBs in neurons can be detected 
by incorporation of 5-ethynyl-2'-deoxyuridine 
(EdU) (SAR-seq) (3, 4). DNA repair synthesis 
and SSBs were localized to neuronal enhanc- 
ers, corresponding to 2% of the genome (3, 4). 
Why neurons concentrate the repair machin- 
ery to these hotspots is an unsolved question; 
moreover, the source of endogenous SSBs and 
their physiological relevance remain unclear. 


Active DNA demethylation generates SSBs 
and ADP-ribosylation in neurons 


SAR-seq peaks in neurons correlate with oxi- 
dized forms of 5-methylcytosine, suggesting the 
potential involvement of active DNA demeth- 
ylation via ten-eleven translocation (TET) en-
zymes (4) (fig. S1A). In induced pluripotent
stem cell (iPSC)-derived neurons (iNs), all TET
family enzymes (TET1, TET2, and TET3) are
expressed (fig. S1B). During passive demethyla-
tion, TET-mediated oxidized methylcytidines
are lost during successive rounds of replication. 
During active DNA demethylation, the thymidine 
DNA glycosylase (TDG) excises TET-mediated 
oxidized methylcytidines 5fC and 5caC to pro- 
duce SSBs (5, 6). SSBs can be detected by 
S1-END-seq using chain-terminating dide- 
oxynucleosides (ddN) (Fig. 1A) (4). Given the 
observed accumulation of SSBs near oxidized 
methylcytosines (4), we set out to test whether 
active DNA demethylation is a source of endog- 
enous DNA damage in postmitotic neurons. 
Initially, we used CRISPR interference 
(CRISPRi) to deplete TDG in human iPSCs 
(fig. S1C), after which they were differentiated
into neurons (iNs) (7, 8). TDG-deficient, but 
not wild-type (WT) iNs, accumulated high 
levels of 5fC/caC (fig. S1D). We previously
found that the mixture of ddNs led to robust
accumulation of SSBs detected by S1-END-seq
in iNs (4). By treating cells with each ddN separately,
we found that only ddC, but not ddA,
ddT, or ddG, produced single-strand DNA gaps
(Fig. 1B). We therefore tested whether the formation
of SSBs was dependent on TDG and
found that, although SSBs accumulated at enhancers
in WT iNs, TDG-depleted iNs harbored
far fewer lesions (fig. S1, E and F).

Fig. 1. Single-strand breaks at neuronal enhancers are TDG dependent.
(A) Schematic overview of SSB generation during SP- and LP-BER and SSB
detection by ddC S1-END-seq. (B) Heatmaps of S1-END-seq peaks in iN
upon overnight treatment with different dideoxynucleosides (ddA, ddC, ddT, or
ddG), plotted 1 kb on either side of SAR-seq peak summits, ordered by SAR-seq
intensity. (C) Genome browser screenshot showing single-strand breaks
(ddC S1-END-seq) detected in TDG^degron iN treated with or without dTAG.
(D) Heatmaps of ddC S1-END-seq in TDG^degron iN treated with or without
dTAG. ddC S1-END-seq is ordered by SAR-seq intensity (4) and plotted 1 kb on
either side of SAR-seq peak summits in iN. (E) Representative images (left)
and quantification (right) of iN with immunofluorescence staining for DAPI
(blue) and PARylation (green) in TDG^degron iN treated with or without dTAG.
Scale bar, 10 μm. Statistical significance was determined using Mann-Whitney
test (****p < 0.0001).

Because TET-mediated active DNA demethylation
may be important for transcriptional
changes during neuronal differentiation, we
inactivated TDG after cells had already differentiated
into iNs. To do this, we used the
degradation tag (dTAG) system, wherein a
FKBP12 tag was knocked into the N terminus
of endogenous TDG in iPSCs (fig. S1G). After
6 days of differentiation, TDG was acutely
degraded in neurons upon addition of dTAG
(fig. S1, H and I). On day 7, we performed ddC
S1-END-seq. Consistent with the CRISPRi results,
we found that TDG was required for SSB
formation (Fig. 1, C and D).

A high level of PARP activity was found in
iNs (Fig. 1E) (4). To test whether this is related
to active DNA demethylation, we measured
PARP activity when TDG was acutely degraded
with dTAG. Indeed, ADP-ribosylation decreased
on average by twofold in neurons lacking TDG
(Fig. 1E). Thus, TET-mediated active DNA demethylation
is the source of SSBs at neuronal
enhancers, although other lesions could also
contribute to ADP-ribosylation.


TET2 is required for pre-B to macrophage
differentiation


In contrast to neurons, recurrent DNA repair
peaks were not observed in other postmitotic
cells, including G0-arrested pre-B cells and
iPSC-derived skeletal muscle cells (4). To understand
whether overall DNA damage and
repair is exclusive to neurons, we examined
postmitotic macrophages derived by transcription
factor-induced transdifferentiation
from pre-B cells (Fig. 2A) (9). C/EBPα turns on
the myeloid program in pre-B cells efficiently
(>95%) and rapidly (48 hours) (Fig. 2B) (9, 10).
Cell conversion requires up-regulation of the
C/EBPα target PU.1, which drives macrophage-
specific gene expression and down-regulation
of the B cell-specific transcription factor PAX5
to shut down the B cell transcriptional program
(11). Previous studies demonstrated that
TET2 knockdown partially delayed C/EBPα-
induced macrophage (iM) differentiation because
of impaired up-regulation of myeloid
genes (12). To confirm the involvement of
TET2 in pre-B to macrophage cell differentiation,
we knocked out TET2 in pre-B cells by
CRISPR-Cas9 targeting (fig. S2A). Consistent
with previous findings, the macrophage cell 
surface marker MAC-1 was induced in more 
than 95% of control cells 2 days after C/EBPα
stimulation, whereas only 4% of TET2-deficient 
cells expressed MAC-1 (fig. S2B). Thus, TET2 is 
required for pre-B to iM conversion. 


Active DNA demethylation triggers SP-BER 
during macrophage trans-differentiation 


To test whether active DNA demethylation op- 
erates during macrophage cell conversion, we 
knocked out TDG in pre-B cells (fig. S2C). In 
contrast to the severe differentiation block in 
TET2−/− cells (fig. S2B), most TDG−/− cells
down-regulated B-cell surface marker CD19
and expressed MAC-1 by 2 days post-C/EBPα
induction (Fig. 2B). Thus, distinct from TET2,
TDG is not required for the generation of
macrophage-like cells. Nevertheless, we detected
an accumulation of 5fC/caC in TDG−/−
but not WT iM (fig. S2D). Thus, TDG excises 
oxidized methylcytosines during the transdif- 
ferentiation of pre-B to macrophage cells. 

To localize active DNA demethylation and 
associated repair sites, cells were treated with 
aphidicolin (APH) for 4 hours after 1 day of C/EBPα
induction to inhibit any residual cell division
without affecting differentiation (fig. S2E).
Cells were then labeled with EdU for 20 hours 

Fig. 2. TDG-dependent SSBs at macrophage enhancers. (A) Schematic
overview of pre-B to iM transdifferentiation. (B) FACS plot (left) and quantification
(right) of MAC-1 and CD19 expression in C/EBPα-ER-infected wild type and
TDG−/− pre-B cells before induction and 2 days after induction with β-estradiol.
Statistical significance was determined using unpaired t test (mean ± SD).
(C) Genome browser screenshots illustrating SAR-seq and ddC S1-END-seq
in iM. For SAR-seq, cells were either not treated (NT) or treated with PARPi on
day 1 and harvested on day 2. (D) Bar graph comparing fold increase of
SAR-seq peak numbers upon PARPi versus nontreated iM or nontreated iN.
Also plotted is the fold increase in iN expressing POLB-targeted (sgPOLB) or
XRCC1-targeted (sgXRCC1) CRISPRi plasmids versus nontargeted control.
(E) Heatmaps illustrating the correlation in iM between SAR-seq, ddC S1-END-seq,
enhancer markers (ChIP-seq for H3K4me1, H3K27ac, and PU.1), and 5hmC
(5hmC Seal). (F) Genome browser screenshots showing SAR-seq (with PARPi),
ddC S1-END-seq, transcription factor binding (C/EBPα and PU.1), and enhancer
markers (ChIP-seq for H3K4me1, H3K27ac) in iM at the Klf4 de novo enhancer.
“B” represents pre-B cells and “iMΦ” represents induced iM. (G) Genome
browser screenshot showing SAR-seq (with PARPi) and ddC S1-END-seq in WT
and TDG knockout iM. (H) Heatmaps showing SAR-seq (with PARPi) and
ddC S1-END-seq in WT and TDG-deficient iM.


and processed for SAR-seq (Fig. 2C) (4). Dis- 
tinct from iNs, which exhibited ~55,000 hot- 
spots of DNA repair (4), almost no peaks were 
detectable in iMs (Fig. 2C), reminiscent of our 
findings in other postmitotic cells (4). One 
potential explanation is that replication- 
independent active DNA demethylation gen- 
erates only few SSBs in iMs, making a minor 
contribution to differentiation, as has been 


SCIENCE science.org 


shown in other cell types (13). Alternatively,
TDG-mediated excision of 5fC/caC might occur 
at a high frequency, but these bases would 
be replaced by unmodified cytosine via short 
patch (SP) base excision repair (BER). SP-BER 
events triggered by TDG excision of oxidized 
methylcytosines are undetectable by SAR-seq 
because this assay measures incorporated EdU,
a thymidine analog.

Depletion or inhibition of factors involved
in SP-BER, including XRCC1, PARP1, and POLβ,
leads to an approximately twofold increase in
the number of SAR-seq peaks in iNs (Fig. 2D),
suggesting that long patch (LP)-BER is the
primary source of the SAR-seq signal in these
cells (4). Because iMs lack robust SAR-seq signals,
we asked whether SP-BER participates in
active cytosine demethylation. To investigate
this question, we first inactivated SP-BER via
PARP inhibition (PARPi) and found a 60-fold
increase in the number of SAR-seq peaks in
iMs from 1951 to 119,397 (Fig. 2, C and D). To
further evaluate the function of SP-BER in
iMs, we treated cells with chain-terminating
ddN followed by S1-END-seq. As in iNs, SSBs
were detectable only when iMs were treated
with ddC alone, but not with ddA, ddT, or ddG
(Fig. 2C and fig. S2F). Overall, 24,000 SSB peaks
were detectable in iM, which is similar to the
number of S1-END-seq peaks found in iN
(28,000). Thus, SSBs accumulate in both cell
types, but whereas LP-BER contributes to about
half of SSB repair in iNs, SP-BER is overwhelmingly
dominant in iMs.


SSBs form predominantly at de novo
enhancers and are TDG dependent


Like iNs (3, 4), LP-BER sites in iMs, measured
after PARPi treatment, colocalized and correlated
with enhancers marked with H3K4me1
and H3K27ac (Fig. 2E). Analysis of transcription
factor binding motifs revealed an overrepresentation
of PU.1 binding sites centered
at SAR-seq summits (fig. S2G). ChIP-seq analysis
of PU.1 binding confirmed colocalization
with sites of SSBs, DNA repair synthesis, and
5hmC (Fig. 2E). Notably, PU.1 and C/EBPα interact
with TET2, PU.1- and C/EBPα-targeted
enhancers undergo TET2-mediated demethylation,
and PU.1 is required for myeloid differentiation
(14-16).

Fig. 3. Active DNA demethylation contributes to iM cell identity.
(A) Imaging flow cytometric analysis of phagocytosis in iM. Scale bar, 7 μm.
BF (bright field), green fluorescent protein (GFP) (C/EBPα expression),
MAC-1 (MAC-1 expression, red), E. coli (dsRed-E. coli, yellow). (B) Bar graph of
percentages of MAC-1-positive WT and TDG−/− iMs that internalized dsRed-E. coli.
Statistical significance was determined using unpaired t test (****p < 0.0001).
(C) Bar graph and median intensity of internalized dsRed-E. coli
in MAC-1-positive cells. Statistical significance was determined using unpaired
t test (****p < 0.0001). (D) Linear plots showing expression changes of
up- or down-regulated genes on average in WT and TDG−/− cells. (E) Left,
genome browser screenshot showing SAR-seq, ddC S1-END-seq, and enhancer
activity (H3K27ac ChIP-seq) in WT and TDG−/− iM at the Csf1r enhancer.
Right, expression of Csf1r in WT and TDG−/− iM measured by RNA-seq.
(F) Heatmap of gene expression associated with macrophage activation in WT
and TDG−/− cells before and during transdifferentiation. Tlr (toll-like receptor)
genes are indicated.

Myeloid enhancers have been described as
“preexisting” or “de novo” depending on the
order of recruitment of PU.1 and C/EBPα (6).
Preexisting enhancers are prebound by PU.1
or C/EBPα and already active in pre-B cells,
whereas de novo enhancers are first bound by
C/EBPα before PU.1 and gradually become
active during macrophage transdifferentiation
(16). Although SAR-seq and S1-END-seq peaks
were found at both types of enhancers, sites
of DNA repair were more highly enriched for
de novo macrophage enhancers (figs. S2, H
to J). Examples of de novo enhancers that
become demethylated and activated upon
C/EBPα binding include the enhancers of
Klf4 and Lefty2 (Fig. 2F and fig. S2K) (15).
These de novo enhancers harbored peaks of
SSBs and DNA repair synthesis that colocalized
with C/EBPα and PU.1 (Fig. 2F and fig.
S2K). We conclude that PU.1 guides TET2 to
de novo macrophage enhancers, which promotes
active DNA demethylation, SSB formation,
and SP-BER.

To determine whether SSB formation and
repair at macrophage enhancers is TDG dependent,
we performed ddC S1-END-seq and SAR-seq
in WT versus TDG−/− cells in the presence
of PARPi. These assays revealed that TDG was
essential for both SSB formation and DNA
synthesis-associated repair in iM (Fig. 2, G and
H). Thus, TET-mediated active DNA demethylation
is the source of SSBs at both neuronal
and macrophage enhancers.

Fig. 4. Anti-C metabolites enhance repair synthesis and trigger the
DNA damage response. (A) Heatmaps illustrating increased SAR-seq upon
ddC treatment in iN and iM. Top, aggregate plots of SAR-seq intensity.
(B) Heatmaps illustrating increased SAR-seq upon Ara-C and gemcitabine
treatment of iN. Top, aggregate plots of SAR-seq intensity. (C) Gene Ontology
analysis showing the enrichment of p53 target genes differentially expressed
(|fold change| > 2) upon ddC, Ara-C, and Ara-A treatment in iN. The x axis
represents the enrichment value as the logarithm of false discovery rate
(FDR). (D) Survival of iN treated with Ara-C, dTAG alone, or Ara-C upon
TDG depletion (Ara-C+dTAG). Data represent viability relative to nontreated


Active DNA demethylation contributes 
to cell identity 


Absence of TDG-mediated excision of 5fC/caC
did not prevent the generation of macrophage-like
cells (Fig. 2B). Yet, thousands of SSBs were
generated (Fig. 2, C and E), and thousands of
genes are up- and down-regulated during normal
differentiation (12). This raises the question
of the physiological role of active DNA
demethylation. To test the functionality of
TDG−/− macrophages, we measured the phagocytic
activity of C/EBPα-induced cells by incubating
cells with prelabeled dsRed fluorescent
protein-expressing Escherichia coli prior to
analysis by image cytometry (Fig. 3A). Whereas
90% of control MAC-1+ cells ingested E. coli after
48 hours following incubation, the phagocytic
capacity of MAC-1+ TDG−/− cells was reduced
to 36% (Fig. 3B). Moreover, there was a 23-fold
reduction in the amount of total E. coli ingested
per mutant cell (Fig. 3C). Thus, TDG loss compromises
the ability to phagocytose bacteria in
transdifferentiated cells.

To determine the role of active cytosine demethylation
on enhancer activity and gene
expression, we performed RNA sequencing
(RNA-seq) on WT versus TDG−/− cells before
(0 hours) and during (24, 48, and 72 hours)
macrophage differentiation (Fig. 3D). In WT


2 DECEMBER 2022 + VOL 378 ISSUE 6623 987 


RESEARCH | RESEARCH ARTICLES 


cells, most down-regulated genes were associated
with biological processes involving cell
cycle and DNA repair, likely caused by the rapid
cell cycle arrest induced by C/EBPα (fig. S3A,
left panel) (10). Gene Ontology analysis revealed
up-regulation of genes relating to macrophage
function (fig. S3A, right panel). Although similar
gene sets were down-regulated in WT and
TDG−/− cells, TDG−/− cells failed to effectively
up-regulate many genes related to macrophage
differentiation (Fig. 3D).

We then sorted up-regulated genes into those
that were either close to or far from DNA repair
(SAR) loci. Because the majority (70%) of up-regulated
genes are localized within 100 kb
of stable C/EBPα-binding sites (15, 16), we used
100 kb as the cut-off within which genes are
considered neighboring enhancers. We observed
that TDG−/− cells exhibited a defect in
the up-regulation of genes that were located
close to enhancers active for DNA repair (fig.
S3B). Genes that were up-regulated in a TDG-dependent
manner included those that are
critical for macrophage identity and function
(fig. S3C) such as Csf1r, a marker of cells of
mononuclear phagocyte lineage. The highly
conserved super-enhancer (fms-intronic regulatory
element, FIRE) that controls Csf1r expression
(17) showed robust SAR-seq and ddC
S1-END-seq peaks (Fig. 3E). DNA repair, SSB
formation, and enhancer activity, as well as
Csf1r gene expression, were impaired in the
absence of TDG (Fig. 3E).
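
The close-versus-far sorting described above can be illustrated with a minimal sketch. It assumes each gene is reduced to a single transcription start site and each repair locus to a single summit coordinate; the function name `split_by_proximity`, the gene labels, and all coordinates are hypothetical and are not taken from the study.

```python
from bisect import bisect_left
from collections import defaultdict

def split_by_proximity(genes, peaks, cutoff=100_000):
    """Split genes into those within `cutoff` bp of any peak and the rest.

    genes: dict mapping gene name -> (chromosome, TSS position)
    peaks: iterable of (chromosome, summit position)
    """
    # Index peak positions per chromosome, sorted for binary search.
    by_chrom = defaultdict(list)
    for chrom, pos in peaks:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    near, far = [], []
    for name, (chrom, tss) in genes.items():
        positions = by_chrom.get(chrom, [])
        # First peak at or to the right of the window's left edge.
        i = bisect_left(positions, tss - cutoff)
        hit = i < len(positions) and positions[i] <= tss + cutoff
        (near if hit else far).append(name)
    return near, far

# Hypothetical genes and SAR-seq summits, for illustration only.
genes = {
    "geneA": ("chr1", 500_000),
    "geneB": ("chr1", 2_000_000),
    "geneC": ("chr2", 100_000),
}
peaks = [("chr1", 560_000), ("chr1", 1_000_000), ("chr2", 300_001)]
near, far = split_by_proximity(genes, peaks)
print("near:", near, "far:", far)
```

The 100-kb default mirrors the cut-off stated in the text; the binary search is only a convenience so that the sketch scales to many peaks.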

Because MAC-1+ TDG−/− cells showed a defect
in phagocytosis, we focused on genes known to
be critical for macrophage activation (Fig. 3F).
Most of these genes were up-regulated in differentiating
WT cells but failed to increase in
TDG−/− cells (Fig. 3F). Notable among them
were TLR4 (Fig. 3F) and CD14 (fig. S3C), which
encode the innate immune receptor complex
that recognizes the lipopolysaccharide
(LPS) cell wall component of Gram-negative
bacteria including E. coli. Thus, active DNA
demethylation of lineage-specific enhancers
is required for proper up-regulation of genes
critical for terminal differentiation and activation
of macrophages.


In vivo long-patch tract lengths 


TET activity is targeted to methylated CG di- 
nucleotides within enhancers (5, 6). Although 
ddN (and ddC) S1-END-seq demonstrated the
expected prevalence of cytosines at S1-END-seq
summits, composite motif analysis failed
to reveal CG dinucleotides at ddC S1-END-seq
peak summits (4). The lack of a CG motif could
result from ddC incorporation during LP- 
BER distal from the initiating methylated CG 
dinucleotide (Fig. 1A and fig. S4A). 

To test this idea, we developed a genome- 
wide base resolution assay to map 5fC and 
5caC residues excised by TDG (fig. S4A). In 
this method, termed oxEND-seq, 5fC/caC sites 
are cleaved into double-strand breaks (DSBs)
and then detected by END-seq. DNA is first 
treated with pyridine borane (PB) which con- 
verts 5fC and 5caC into dihydrouracil. USER 
enzyme, a mixture of uracil DNA glycosylase 
(which excises the uracil base) and endonu- 
clease VIII (which breaks the phosphodiester 
backbone with lyase activity), generates a 
single-nucleotide gap. S1 nuclease will then 
cleave these 5fC:G- or 5caC:G-formed gaps to 
generate a DSB (fig. S4A). We applied this 
technique in TDG-knockout iMs, in which 
5fC/5caC bases accumulate but SSBs do not 
form (fig. S2D). oxEND-seq revealed peaks
that frequently overlapped with sites of SAR- 
seq and ddC S1-END-seq (fig. S4B). Motif anal- 
ysis confirmed the expected pattern of CG 
dinucleotides at oxEND-seq peak summits 
(fig. S4C). We then calculated the relative dis- 
tance of 5fC/caC to the nearest ddC S1-END-seq 
summit (fig. S4D). We observed a distribution 
of ddC S1-END-seq summits distal from the 
initiating CG dinucleotide. Approximately 17%
of SSBs were located precisely at the CG dinucleotide,
which reflects S1-END-seq detection
of SP-BER events (Fig. 1A and fig. S4D).
However, 67% were found within 30 residues
from the TDG−/− oxEND-seq summits, corresponding
to long-patch repair (Fig. 1A and fig.
S4D). Thus, tract lengths vary considerably 
in vivo, but most are within 30 base pairs (bp) 
of the initiating lesion. 
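
As a rough illustration of the distance analysis described above, the following sketch bins each oxidized-cytosine site by the distance to its nearest SSB summit. It is a simplification under stated assumptions: a single chromosome, integer coordinates, and mutually exclusive bins. The function names and every coordinate are hypothetical and do not come from the study.

```python
from bisect import bisect_left

def nearest_distance(site, sorted_summits):
    """Distance in bp from `site` to the closest position in `sorted_summits`."""
    i = bisect_left(sorted_summits, site)
    # Only the neighbours on either side of the insertion point can be closest.
    neighbours = sorted_summits[max(i - 1, 0):i + 1]
    return min(abs(site - s) for s in neighbours)

def classify_sites(ox_sites, ssb_summits, window=30):
    """Bin each oxidized-cytosine site by distance to its nearest SSB summit."""
    summits = sorted(ssb_summits)
    counts = {"at_site": 0, "within_window": 0, "beyond_window": 0}
    if not summits:
        return counts
    for site in ox_sites:
        d = nearest_distance(site, summits)
        if d == 0:
            counts["at_site"] += 1        # consistent with short-patch repair
        elif d <= window:
            counts["within_window"] += 1  # consistent with long-patch repair
        else:
            counts["beyond_window"] += 1
    return counts

# Hypothetical coordinates on one chromosome, for illustration only.
ox_sites = [100, 250, 400, 900]
ssb_summits = [100, 262, 455, 5000]
counts = classify_sites(ox_sites, ssb_summits)
print(counts)
```

The 30-bp window follows the tract-length bound quoted in the text; a real analysis would run per chromosome and per strand.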


Chain termination with cytosine analogs 
triggers LP-BER and the p53 response 


Incorporation of ddC at macrophage and neu- 
ronal enhancers results in a single-strand DNA 
(ssDNA) gap because a phosphodiester bond 
cannot be formed between the missing 3’ OH 
group and the next nucleotide. We found that, 
like PARPi [which triggers LP-BER (4)], ddC 
treatment increased DNA synthesis-associated 
repair in both iN and iM (Fig. 4A). Moreover, 
PARPi and ddC SAR-seq peaks overlapped 
(fig. S5A). This suggests the possibility that 
chain termination provokes LP-SSB repair (fig. 
S5B). Consistent with this, we found that other 
cytosine analogs, such as arabinosylcytosine 
(Ara-C) and gemcitabine (GEM), similarly in- 
creased the intensity of SAR-seq peaks at en- 
hancer sites (Fig. 4B). By contrast, adenine 
analogs, Ara-A and ddA, did not result in any 
change in SAR-seq (fig. S5, C and D). We spec- 
ulate that when antimetabolites (ddC, Ara-C 
or GEM) are incorporated into DNA during 
SP-BER, unligated gaps are detected, leading 
to nucleoside analog excision, and processing 
by long-patch repair, thereby explaining the ob- 
served increase in DNA synthesis-associated 
repair at enhancers (fig. S5B). 
Antimetabolites inhibit replication in mitotic 
cells and are thereby frequently used to treat 
cancer. Cancer treatments are also commonly 
associated with neurotoxicity through unknown
mechanisms (18). Ara-C-induced cell
death is dependent on p53 (19, 20), suggesting
that Ara-C may kill neurons by a DNA damage- 
activated p53-dependent pathway. Consistently, 
we observed a robust p53 transcriptional re- 
sponse when neurons were treated with ddC 
and Ara-C, but not when treated with Ara-A 
(Fig. 4C). Moreover, 6 days after Ara-C treat- 
ment, almost all neurons in the culture had 
died (Fig. 4D). Acute degradation of TDG pre- 
vented Ara-C-induced cell death (Fig. 4D). This 
suggests that antimetabolite-induced neuro- 
toxicity is linked to TET-initiated active DNA 
demethylation. 


Incorporation of nucleoside analogs triggers 
the DNA damage response 


A recent study demonstrated that Ara-C triggers
histone H2AX phosphorylation in primary
hippocampal neurons (21). Terminal
deoxynucleotidyl transferase dUTP nick end
labeling (TUNEL) did not reveal DSBs in the
nuclei of γ-H2AX-positive neurons (21). We
also observed γ-H2AX and 53BP1 foci formation
in almost 100% of iNs after Ara-C or ddC
treatment (Fig. 4E and fig. S5E). ddC treatment
led to only a small increase in DSBs revealed
by END-seq signal, but a marked increase in
S1-END-seq (fig. S5F), indicating a greater
number of SSBs than DSBs. In line with ssDNA
damage, γ-H2AX formation was largely ATR
dependent (fig. S5G). Moreover, ddC-induced
S1-END-seq and γ-H2AX formation required
POLβ (fig. S5H), likely because POLβ is highly
selective for ddN (22). Finally, we observed
that γ-H2AX/53BP1 formation was entirely
dependent on TDG (Fig. 4E and fig. S5E). The
low frequency of DSBs relative to SSBs is
consistent with the finding that DSBs rarely
arise from symmetrically methylated CGs because
of the highly coordinated and sequential
action of TET-TDG-mediated base excision
repair (23).

On average, six and eight γ-H2AX and 53BP1
foci were detectable after overnight treatment
with ddC and Ara-C, respectively (Fig. 4E). Foci
appeared within 2 hours of ddC and Ara-C
treatment and were detectable for at least
16 hours after drug withdrawal (fig. S5I). This
suggests that DNA damage either persists or 
is continually generated. If DNA lesions are 
continually produced by TET-mediated active 
DNA demethylation but subsequently repaired, 
acute degradation of TDG after drug with- 
drawal would lead to the disappearance of 
ddC and Ara-C-induced foci. Indeed, DNA 
damage no longer persisted when TDG was 
eliminated after the drugs were withdrawn 
(fig. S5J). 

To directly monitor TET activity and DNA 
repair in living cells, we expressed an mCherry- 
53BP1 reporter (24) in neurons. Upon ddC treat- 
ment, we observed that 53BP1 foci appeared, 
dissolved, and then reappeared throughout the 
15-hour time course (movie S1). By tracking
individual foci from when they first appeared 
to when they disappeared, we estimate that 
most ddC-induced DNA damage events are 
resolved within 1 to 2 hours (fig. S5K). 


Active DNA demethylation in fully 
differentiated iNs 


Induced degradation of TDG on day 7 after iN 
differentiation led to the disappearance of SSBs, 
demonstrating the existence of ongoing TET- 
mediated oxidation. Within 14 to 28 days of 
maturation, iNs display differentiated neuron 
markers, action potential firing, and spontaneous
synaptic currents, suggesting that they
are functional excitatory glutamate-releasing
neurons (8). Consistent with these findings,
the analysis of gene expression profiles during 
iN differentiation revealed that the transcrip- 
tome continued to change beyond day 7, but 
fewer changes were detected between days 17 
and 30 (fig. S6A). Reminiscent of our finding in 
iMs, genes were similarly down-regulated in 
WT and TDG−/− iN throughout their maturation
(fig. S6B). However, TDG knockouts partially
impaired the up-regulation of genes that were
induced after day 7 (fig. S6B), including those
regulating presynaptic signaling (fig. S6C).
Even on day 30, we observed robust formation
of ddC-induced γ-H2AX and 53BP1 foci
in the vast majority (>95%) of iNs,
which was abolished by acute degradation of 
TDG (Fig. 4F). Thus, TDG-mediated SSBs are 
generated in both differentiating and fully 
differentiated iNs. 


Discussion 


TET is highly active in postmitotic neurons 
(25, 26) and to a lesser degree in other so- 
matic cell types (6). An unsolved question is 
the extent to which replication-independent 
demethylation contributes to cell differentia- 
tion and function. We have shown that TDG- 
dependent SSB intermediates accumulate at 
high levels at lineage-specifying enhancers in 
iPSC-derived neurons and transdifferentiated 
macrophages. Moreover, TDG contributes to 
transcriptome reprogramming in both differ- 
entiation systems. There are other examples 
of DNA demethylation in postmitotic cells in 
which gene expression is sensitive to loss of 
TDG. Axonal injury in retinal ganglion neu- 
rons (27) and dorsal root ganglion neurons 
(28) increases TET- and TDG-dependent active 
DNA demethylation to induce regeneration- 
associated gene expression. However, in other 
cases, TDG does not seem to contribute to 
transcriptional changes downstream of TET- 
mediated oxidation. For example, LPS triggers 
cell cycle exit and 5hmC accumulation in bone- 
marrow derived macrophages (BMDM) (13). 
The induced transcriptional program is independent
of TDG even though 5fC/5caC accumulates
at the top latent enhancers (Batf, Mdfic,
Il1b, and Il6) that acquire H3K27ac after stimulation
(13). Consistent with this, we observed
that LPS induces SSBs at these latent enhancers
in BMDM (fig. S7). TDG is also dispensable
for the active DNA demethylation of 5mC
in mouse zygotes (29). Removal of oxidized 
methylcytosines could potentially be mediated 
by other BER glycosylases. Alternatively, path- 
ways that do not generate DNA breaks might 
mediate active DNA demethylation, including 
direct dehydroxymethylation of 5hmC (30), decarboxylation
of 5caC (31), or deformylation of
5fC (32). By using high-resolution techniques 
to trace SSB intermediates and SP- versus LP- 
BER pathways, it should be possible to clarify 
the physiological role of active DNA demeth- 
ylation during cell differentiation, activation, 
and injury. 

Anticancer drugs frequently produce acute 
and sometimes persistent neurological and 
neuropsychiatric symptoms referred to as 
“chemobrain” (18). To date, there are no effective
treatments or preventive measures for
chemobrain, and the underlying mechanisms 
are poorly understood. We speculate that per- 
turbed gap-filling synthesis at regulatory ele- 
ments during active DNA demethylation could 
be one potential mechanism contributing to 
chemobrain. Notably, Ara-C is the most com- 
mon chemotherapeutic agent that induces cer- 
ebellar dysfunction (33), and some patients 
have permanent impairment due to Purkinje 
cell loss in the cerebellum (34). Moreover, re- 
cent studies provide evidence that DNA de- 
methylation is highly active in Purkinje neurons 
(26). Considering our findings, it would be 
interesting to determine whether inhibitors 
of TDG, or inhibitors of additional pathway 
components that trigger the DNA damage re- 
sponse, could be promising candidates to alle- 
viate some of the neurological complications 
associated with anticancer drug therapies. 
Note added in proof: A recent study demon- 
strated that SAR sites are enriched for somatic 
mutations detected by ultradeep genome se- 
quencing of individual neurons from normal 
individuals, suggesting that active DNA de- 
methylation may contribute to mutation (35). 
MACHINE LEARNING


Mastering the game of Stratego with model-free 
multiagent reinforcement learning 


Julien Perolat*†, Bart De Vylder*†, Daniel Hennes, Eugene Tarassov, Florian Strub,
Vincent de Boer‡, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony,
Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys,
Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles,
Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei,
Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Remi Munos, David Silver,
Satinder Singh, Demis Hassabis, Karl Tuyls*†


We introduce DeepNash, an autonomous agent that plays the imperfect information game Stratego at
a human expert level. Stratego is one of the few iconic board games that artificial intelligence (AI)
has not yet mastered. It is a game characterized by a twin challenge: It requires long-term strategic
thinking as in chess, but it also requires dealing with imperfect information as in poker. The technique
underpinning DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without
search, that learns to master Stratego through self-play from scratch. DeepNash beat existing
state-of-the-art AI methods in Stratego and achieved a year-to-date (2022) and all-time top-three ranking on
the Gravon games platform, competing with human expert players.


Progress in artificial intelligence (AI) has been measured through the mastery of
board games since the inception of the field. Board games allow us to gauge and
evaluate how humans and machines develop and execute strategies in a controlled
environment. The ability to plan ahead has been at the heart of successes in AI for
decades in perfect information games such as chess, checkers, shogi, and Go, as well
as in imperfect information games such as poker and Scotland Yard (1-6). For many
years, the Stratego (7) board game has constituted one of the next frontiers of AI
research (for a visualization of the game phases and game mechanics, see Fig. 1). The
game poses two key challenges. First, the game tree of Stratego has 10^535 possible
states, which is larger than both no-limit Texas Hold’em poker, a well-researched
imperfect information game with 10^164 states, and the game of Go, which has 10^360
states. Second, acting in a given situation in Stratego requires reasoning over more
than 10^66 possible pairs of private deployments at the start of the game, whereas in
Texas Hold’em poker, players are dealt one of 10^3 different two-card hands for 10^6
possible private configurations with two players. Perfect information games such as Go
and chess do not have a private deployment phase, thus avoiding the complexity that this
challenge poses in Stratego. Currently, it is not possible to use either state-of-the-art
model-based perfect information planning techniques or state-of-the-art imperfect
information search techniques that break down the game into independent situations (5, 6).

DeepMind Technologies Ltd., London, UK.
*Corresponding author. Email: perolat@deepmind.com (J.P.);
bartdv@deepmind.com (B.D.V.); karltuyls@deepmind.com (K.T.)
†These authors contributed equally to this work and are co-lead authors.
‡Independent consultant.

For these reasons, Stratego is a major chal- 
lenge for the AI community and provides a 
hard benchmark for studying strategic inter- 
actions at an unparalleled scale. As in most 
board games, Stratego tests the ability to make 
relatively slow, deliberative, and logical deci- 
sions sequentially. Additionally, in most im- 
perfect information games, other tactics that 
better reflect decision-making processes in 
the real world need to be deployed. As von 
Neumann described it, “real life consists of 
bluffing, of little tactics of deception, of ask- 
ing yourself what is the other man going to 
think I mean to do” (8). Most recent successes 
in large imperfect information games have 
been achieved in real-time strategy games 
such as StarCraft, Dota, and Capture the Flag 
(9-11) or in racing simulation video games 
such as Gran Turismo (12), in which most decisions
must be made quickly and instinctively


and are of a continuous-time nature. Stratego 
is a game for which little progress has been 
achieved by the AI research community be- 
cause of the many complex aspects of its struc- 
ture. Successes in the game have been limited, 
with artificial agents only able to play at a 
level comparable to a human amateur [see, 
e.g., (13-17)]. 

This work introduces a novel game-theoretic
method that allows an AI to learn to play
Stratego in self-play in a model-free manner
without human demonstration and from
scratch. This new method resulted in a bot
called DeepNash that beat previous state-of- 
the-art AI agents and achieved human expert- 
level performance in the most complex variant 
of the game, Stratego Classic. At the core of 
DeepNash is a principled, novel, model-free 
reinforcement learning (RL) algorithm called 
Regularized Nash Dynamics (R-NaD). Our 
method resulting in DeepNash combines 
R-NaD with a deep neural network architec- 
ture to learn a strategy that plays at a highly 
competitive level by aiming to find a Nash 
equilibrium (18) (i.e., an unexploitable strategy
in zero-sum two-player games). In earlier work, 
it was formally shown that an R-NaD ap- 
proach converges to a Nash equilibrium in 
several classes of matrix games, including 
two-player zero-sum games (19). The present 
work suggests that R-NaD at scale converges 
empirically to an approximate Nash equilib- 
rium in Stratego. Figure 2 illustrates a high-level 
overview of this approach, which underlies 
DeepNash. 

The performance of DeepNash was system- 
atically evaluated against various state-of-the- 
art Stratego bots and human expert players on 
the Gravon games platform. DeepNash con- 
vincingly won against all current state-of-the- 
art bots that have been developed to play 
Stratego, producing a win rate of >97%, and 
it achieved a highly competitive level of play 
against human expert Stratego players on 
Gravon, where it ranked among the top three 
players, both on the year-to-date 2022 (deter- 
mined on 22 April 2022) and on all-time 


Table 1. Evaluation of DeepNash against existing Stratego bots. The numbers are reported from 
DeepNash’s point of view. More games (800) were played against bots that could be run automatically. 


Opponent 


No. of games 


Wins Draws Losses 


800 


99.7% 0.0% 0.3% 


Celsius1.1 800 


97.9% 0.0% 2.1% 


Vixen 800 


‘0 
100.0% 0.0% 0.0% 
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Phase 1: Private deployment 
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a | Colonel 
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Captain 
[) Lieutenant 
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Miner: diffuses Bombs 
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Phase 2: Game play 


8 Bomb: immobile; only captured by Miner 
fal Flag: immobile, game over when captured 


Piece types 


Fig. 1. Stratego is a two-player board game in which players try to capture the opponent's flag. Initially, the players secretly deploy 40 pieces of diverse 
strengths on the board. Then, they take turns moving pieces, possibly encountering an opponent piece that reveals both piece identities, and then the weaker piece is 
removed. Two lakes (indicated in blue) cannot be crossed by any piece. The complete rules are defined by the International Stratego Federation. 


A} Imperfect information 


Replicator dynamics: fe 7 (a*) = = mi (at) (Qi (a') — Dy mt 


Reward transformation: 


Fig. 2. Overview of R-NaD. (A) Overview of the R-NaD approach at scale 
underlying DeepNash, which allows for learning to play the imperfect 
information game Stratego. (B to D) R-NaD learns a policy represented 
by a deep neural network (B) through self-play from scratch (C) and 


leaderboards, with a win rate of 84%. There- 
fore, to the best of our knowledge, this is the 
first time an AI algorithm was able to learn 
to play Stratego at a human expert level. It 
is worth mentioning that this performance 
was achieved without deploying any search 
method, which was a key ingredient of many 
milestone achievements in AI for board games 
in the past. 


Methods 


R-NaD at scale takes an end-to-end learning 
approach to solving Stratego by incorporating 
the learning of the deployment phase, i-e., put- 
ting the pieces tactically on the board at the 
start of a game (Fig. 1), into the learning of the 
game-play phase using an integrated deep RL 
and game-theoretic approach. As with much 
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'B} DeepRL 


'D) Nash equilibrium 


ré(x*, w—*,a',a—*) = r¥(a*,a—*) — nlog ( 


simplest form. 


work in two-player zero-sum games, the pur- 
pose is to learn an approximate Nash equi- 
librium through self-play. In the context of 
two-player zero-sum games, a Nash equilib- 
rium guarantees that the agent will perform 
well, even against a worst-case opponent. Such 
robustness typically allows an algorithm to 
perform well against humans [see, e.g., (3-5)]. 
In perfect information games, search tech- 
niques aided by RL have provided state-of-the- 
art superhuman bots in Go and chess (2, 20). 
However, searching for a Nash equilibrium in 
imperfect information games requires esti- 
mating private information of the opponent 
from public states (3, 6). Given the vast num- 
ber of such possible private configurations in a 
public state, Stratego is computationally too 
challenging for all existing search techniques 


+ (b') Qe. (b*)] 
(a!) ) + mlog (=e) 


Teg 2") 


aims at converging to a Nash equilibrium (D). The approach relies on 
two core ideas to reach convergence: replicator dynamics and reward 
transformation. Their equations are shown for illustrative purposes in their 


because the search space becomes intractable. 
This work therefore chose a different route, 
without search, and proposed a new method 
that combines model-free RL in self-play with 
a game-theoretic algorithmic idea, R-NaD. The 
model-free part implies that R-NaD does not 
build an explicit opponent model-tracking 
belief space (calculating a likelihood of the 
opponent’s state), and the game-theoretic part 
is based on the idea that by modifying the dy- 
namical system underpinning the reinforcement- 
learning algorithm, one can steer the learning 
behavior of the agent in the direction of the 
Nash equilibrium. The main advantage of this 
combined approach is that one does not need 
to explicitly model private states from public 
ones. A complex challenge, on the other hand,
is to scale up this model-free RL approach with
R-NaD to make self-play competitive against
human expert players in Stratego, which had
not been achieved to date. This combined
approach is illustrated in Fig. 2.

Fig. 3. Illustrating R-NaD using the matching pennies game. (A) Payoff table. (B) Algorithmic stages. (C) Dynamics and Lyapunov function.

Panel B, R-NaD iteration. Start with an arbitrary regularization policy $\pi_{0,\mathrm{reg}}$.
1. Reward transformation: construct the transformed game with $\pi_{m,\mathrm{reg}}$.
2. Dynamics: run the replicator dynamics until convergence to $\pi_{m,\mathrm{fix}}$.
3. Update: set the regularization policy $\pi_{m+1,\mathrm{reg}} = \pi_{m,\mathrm{fix}}$.
Repeat stages until convergence.
Panel C titles: replicator dynamics (iteration 1, iteration 2) and Lyapunov function.

The following subsections present the learn- 
ing algorithm behind DeepNash, referring to 
the supplementary materials for more techni- 
cal details where relevant. 


Learning approach 


The approach underpinning DeepNash aims 
to learn a Nash equilibrium in Stratego through 
self-play and model-free RL. The idea of com- 
bining the two has been tried before, but it has 
been empirically challenging to stabilize such 
learning algorithms when scaling up to com- 
plex games such as Capture the Flag, Dota, and 
StarCraft (9-11). Some empirical work man- 
ages to stabilize the learning either by training 
against past versions of the agent (9-11) or by 
adding reward shaping (10, 11) or expert data
(9) to the training algorithm. Although these 
approaches help, they lack theoretical founda- 
tions, remain difficult to tune, and are rather 
domain dependent. Furthermore, in a game 
such as Stratego, it is difficult to define a loss 
for which minimization would converge to a 
Nash equilibrium without introducing pro- 
hibitive computational obstacles at large scale. 
For instance, minimizing the exploitability 
(21), a well-known quantity that measures the 
distance to a Nash equilibrium, requires estimating
an agent’s best response during training,
which is computationally intractable in
Stratego. However, it is possible to define a learn- 
ing update rule that induces a dynamical sys- 
tem for which there exists a so-called Lyapunov 
function. This function can be shown to de- 
crease during learning and thus guarantees 
convergence to a fixed point. This is the cen- 
tral idea behind the R-NaD algorithm and is 
the successful recipe for DeepNash, which scales 
this approach using a deep neural network. 
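
To make the notion of exploitability concrete in the smallest possible setting, the sketch below computes it for matching pennies, where each player's best response can be enumerated directly. The payoff convention (player 1 wins on a match), the averaging over both players, and all function names are assumptions made for illustration; nothing here reflects how DeepNash itself is evaluated, and the enumeration is exactly what becomes intractable in Stratego.

```python
# Payoff to player 1 in matching pennies (assumed convention: player 1 wins on a match).
# Rows: player 1 plays H or T; columns: player 2 plays H or T.
# Player 2 receives the negative of each entry (zero-sum).
R1 = [[1.0, -1.0],
      [-1.0, 1.0]]


def best_response_value_p1(pi2):
    """Highest expected payoff player 1 can obtain against a fixed player-2 policy."""
    return max(sum(R1[a][b] * pi2[b] for b in range(2)) for a in range(2))


def best_response_value_p2(pi1):
    """Highest expected payoff player 2 can obtain against a fixed player-1 policy."""
    return max(sum(-R1[a][b] * pi1[a] for a in range(2)) for b in range(2))


def exploitability(pi1, pi2):
    """Average best-response value; zero exactly at a Nash equilibrium of this game."""
    return 0.5 * (best_response_value_p1(pi2) + best_response_value_p2(pi1))


if __name__ == "__main__":
    print(exploitability([0.5, 0.5], [0.5, 0.5]))      # uniform play cannot be exploited
    print(exploitability([0.9, 0.1], [0.5, 0.5]))      # a biased player 1 can be exploited
```

With two actions per player the maximization is a two-element comparison; the cost of the same computation grows with the size of the game tree, which is why the text turns to a Lyapunov argument instead.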


R-NaD algorithm 


The R-NaD learning algorithm used in DeepNash 
is an actor-critic method based on the idea of 
regularization for convergence purposes (19), 
which is briefly explained first in the context 
of zero-sum two-player normal form games 
(illustrated on the matching pennies game). 
A normal form game is an abstraction of a
decision-making situation involving more
than one agent. Each agent (indexed by $i \in \{1,2\}$)
needs to simultaneously take an action
$a^i$ (in a set of possible actions $A^i$) according to
a policy $\pi^i$ (i.e., a distribution over possible
actions $A^i$), after which it receives a game
reward $r^i(a^i, a^{-i})$, and then the game is
repeated. For convenience, the opponent of
player $i$ is indexed by $-i$. R-NaD relies on three
key stages (Fig. 3B), described below.


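In the first stage, the reward is transformed
based on a regularization policy $\pi_{\mathrm{reg}}$, which
induces a modified game with rewards

$$r^i(\pi^i, \pi^{-i}, a^i, a^{-i}) = r^i(a^i, a^{-i}) - \eta \log\left(\frac{\pi^i(a^i)}{\pi^i_{\mathrm{reg}}(a^i)}\right) + \eta \log\left(\frac{\pi^{-i}(a^{-i})}{\pi^{-i}_{\mathrm{reg}}(a^{-i})}\right) \quad (1)$$

where $\eta > 0$ is a regularization parameter and
$i$ is the player index ($i \in \{1,2\}$). Note that this
transformed reward is policy dependent.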
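
As a minimal sketch of the reward transformation in a normal form game, the function below subtracts a log-ratio penalty for the player's own action and adds the corresponding term for the opponent's action. The function name, argument layout, and the default value of `eta` (borrowed from the matching pennies example) are assumptions for illustration only.

```python
import math

ETA = 0.2  # regularization parameter; value borrowed from the matching pennies example


def transformed_reward(r, pi_own, pi_opp, reg_own, reg_opp, a_own, a_opp, eta=ETA):
    """Transformed reward for one player.

    r        : original reward of this player for the joint action (a_own, a_opp)
    pi_*     : current policies (lists of action probabilities)
    reg_*    : regularization policies (lists of action probabilities)
    """
    own_term = math.log(pi_own[a_own] / reg_own[a_own])   # penalizes drifting from own reg policy
    opp_term = math.log(pi_opp[a_opp] / reg_opp[a_opp])   # mirrored term for the opponent
    return r - eta * own_term + eta * opp_term
```

Two properties are easy to check by hand: when both policies equal their regularization policies the reward is unchanged, and because each player's penalty is the other player's bonus, the transformed game stays zero-sum.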

Second, in the dynamics stage, the system 
evolves according to the replicator dynam- 
ics system (22, 23) on this modified game. The 
basic replicator dynamics equations are a de- 
scriptive learning process from evolutionary 
game theory, equivalent to RL algorithms (23), 
which are also equivalent to an instance of 
follow-the-regularized-leader (24) and are de- 
fined as follows: 


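$$\frac{d}{d\tau}\pi^i_\tau(a^i) = \pi^i_\tau(a^i)\left[Q^i_{\pi_\tau}(a^i) - \sum_{b^i} \pi^i_\tau(b^i)\, Q^i_{\pi_\tau}(b^i)\right] \quad (2)$$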
where $Q^i_{\pi_\tau}(a^i)$ is the quality or fitness of an action,
i.e., $Q^i_{\pi_\tau}(a^i) = \mathbb{E}_{a^{-i} \sim \pi^{-i}_\tau}\left[r^i(\pi^i_\tau, \pi^{-i}_\tau, a^i, a^{-i})\right]$.
These dynamics reinforce the probability of
taking actions with high fitness (relative to
other actions). Because of the reward transformation,
this system has a unique fixed point,
$\pi_{\mathrm{fix}}$, and convergence to it is guaranteed in
zero-sum two-player games [see (19)], which
can be proved by the Lyapunov function:

$$H(\pi) = \sum_{i=1}^{2} \sum_{a^i \in A^i} \pi^i_{\mathrm{fix}}(a^i) \log\left(\frac{\pi^i_{\mathrm{fix}}(a^i)}{\pi^i(a^i)}\right)$$ (19).

However, this fixed point is not yet a Nash
equilibrium of the original game.
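
The following sketch assumes the Lyapunov function takes the form of a Kullback-Leibler divergence from the fixed point to the current policy, summed over both players. The function name and the list-of-lists layout are illustrative choices, not taken from the source.

```python
import math


def lyapunov(pi_fix, pi):
    """Sum over players of KL(pi_fix^i || pi^i).

    pi_fix, pi : lists with one policy (list of action probabilities) per player.
    Terms with pi_fix(a) == 0 contribute nothing, by the usual 0 * log 0 = 0 convention.
    """
    return sum(
        pf * math.log(pf / p)
        for fix_i, pol_i in zip(pi_fix, pi)
        for pf, p in zip(fix_i, pol_i)
        if pf > 0.0
    )
```

The quantity is zero when the current policy equals the fixed point and positive otherwise, which is what makes it usable as a progress measure for the dynamics.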

In the final update stage, the fixed point 
obtained is used as the regularization policy 
for the next iteration. These three stages are 
applied repeatedly, generating a sequence of 
fixed points that can be proven to converge to a 
Nash equilibrium of the original (unmodified) 
game (19) in zero-sum two-player games, but 
not in all general-sum games. Figure 3C illus- 
trates the R-NaD algorithm on the two-player 
matching pennies game (the payoff table is 
shown in Fig. 3A). The first iteration starts from
$\pi_{0,\mathrm{reg}}[H, T] = [0.999, 0.001]$ ($\eta = 0.2$), and the
replicator dynamics converge to
$\pi^1_{0,\mathrm{fix}}[H, T] = [0.896, 0.104]$ and $\pi^2_{0,\mathrm{fix}}[H, T] = [0.263, 0.737]$.
The right figure shows the evolution of the lo- 
garithm of the Lyapunov function and illus- 
trates that it decreases while learning. Three 
iterations of R-NaD are shown. 
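
The three stages can be sketched end to end on matching pennies. The code below is an illustrative reconstruction, not the authors' implementation: it assumes player 1 wins on a match, gives both players the same initial regularization policy, integrates the replicator dynamics with a plain explicit-Euler scheme, and uses a hypothetical step size and step count chosen only so that the dynamics settle.

```python
import math

# Payoffs indexed [own action][opponent action]; H = 0, T = 1.
R1 = [[1.0, -1.0], [-1.0, 1.0]]   # player 1 wins on a match (assumed convention)
R2 = [[-1.0, 1.0], [1.0, -1.0]]   # player 2 receives the negative


def q_values(pi_own, pi_opp, reg_own, payoff, eta):
    """Fitness of each action in the transformed game.

    The opponent's log-ratio term is the same for every own action, so it
    cancels in the replicator update and is omitted here.
    """
    return [
        sum(payoff[a][b] * pi_opp[b] for b in range(2))
        - eta * math.log(pi_own[a] / reg_own[a])
        for a in range(2)
    ]


def replicator_step(pi, q, dt):
    """One explicit-Euler step of d pi(a)/dt = pi(a) * (q(a) - sum_b pi(b) q(b))."""
    avg = sum(p * qa for p, qa in zip(pi, q))
    new = [p + dt * p * (qa - avg) for p, qa in zip(pi, q)]
    total = sum(new)  # equals 1 up to rounding; renormalize to absorb drift
    return [p / total for p in new]


def run_dynamics(reg1, reg2, eta=0.2, dt=0.01, steps=20000):
    """Dynamics stage: integrate both players from uniform play until they settle."""
    pi1, pi2 = [0.5, 0.5], [0.5, 0.5]
    for _ in range(steps):
        q1 = q_values(pi1, pi2, reg1, R1, eta)
        q2 = q_values(pi2, pi1, reg2, R2, eta)
        pi1, pi2 = replicator_step(pi1, q1, dt), replicator_step(pi2, q2, dt)
    return pi1, pi2


def r_nad(reg, iterations=3):
    """Repeat: transform rewards with the current reg policy, run dynamics, update reg."""
    reg1, reg2 = list(reg), list(reg)
    history = []
    for _ in range(iterations):
        fix1, fix2 = run_dynamics(reg1, reg2)
        history.append((fix1, fix2))
        reg1, reg2 = fix1, fix2   # update stage
    return history


if __name__ == "__main__":
    for m, (fix1, fix2) in enumerate(r_nad([0.999, 0.001])):
        print(m, [round(p, 3) for p in fix1], [round(p, 3) for p in fix2])
```

Under these assumptions the first fixed point should land close to the values quoted in the text, and later iterations should move both players toward uniform play, the Nash equilibrium of matching pennies.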


R-NaD at scale 


Our method consists of three components: 
(i) a core training component R-NaD, i.e., the 
model-free RL algorithm presented above, 
which is implemented using a deep convolu- 
tional network; (ii) a component that fine- 
tunes the learned policy to reduce the residual 
probabilities of taking highly improbable ac- 
tions; and (iii) a test-time postprocessing com- 
ponent that uses game-specific knowledge 
[whereas (i) and (ii) are game agnostic] to 
filter out low-probability actions and obvious 
mistakes. 

The following section starts by concisely 
laying out some essential background informa- 
tion on imperfect information games neces- 
sary to understand how R-NaD is scaled to a 
deep learning model. Then, the implementation
of the three algorithmic stages of R-NaD is
summarized. A detailed description of R-NaD
is provided in the supplementary materials. 


Imperfect information games 


In a two-player zero-sum imperfect information
game, two players (player $i = 1$ or $i = 2$)
sequentially interact in turns. At turn $t$, the
players receive a reward signal $(r^1_t, r^2_t)$, and the
current player $i = \psi_t$ observes the game state
through an observation $o_t$ and selects an
action $a_t$ according to a parameterized policy
function $\pi(\cdot|o_t)$. In model-free RL, the trajectories
$T = [(o_t, a_t, (r^1_t, r^2_t), \pi(\cdot|o_t)), \psi_t]_{0 \le t < t_{\max}}$ are
the only data the agent will leverage to learn
the parameterized policy.

Fig. 4. Illustration of DeepNash’s assessment of the relative value of material versus information.
Shown is an illustration of DeepNash’s assessment of the relative value of material versus information in two
human (red) versus DeepNash (blue) matches. (A) Although blue is behind a 7 and 8, no pieces are revealed
and only two are moved. As a result, DeepNash assesses its chance of winning to still be ~70% (blue indeed won
this match). (B) Blue to move. DeepNash’s policy supports three moves at this state, with the indicated
probabilities shown (the move on the right was played in the actual match). Although blue has the opportunity
to capture the opponent’s 6 with its 9, this move was not considered by DeepNash, likely because the protection
of 9’s identity was assessed to be more important than the material gain.


Model-free RL with R-NaD 


The R-NaD algorithm is scaled by using deep 
learning architectures. It performs the same 
three algorithmic stages as before in normal 
form games: (i) the reward transformation 
stage, (ii) the dynamics stage that allows for 
convergence to a fixed point, and (iii) the up- 
date stage in which the algorithm updates the 
policy that defines the regularization function. 

The neural architecture consists of the fol- 
lowing components: a U-Net torso with resid- 
ual blocks and skip connections (25) and four 
heads that are smaller replicas of the torso 
augmented with final layers to generate an 
output of the appropriate shape. The first 
DeepNash head outputs the value function 
as a scalar, and the three remaining heads 
encode the agent’s policy by outputting a 
probability distribution over its actions at 
deployment and during game play. The 
agent architecture is described in detail in 
the supplementary materials. 

The observation is encoded as a spatial ten- 
sor consisting of the following components: 
DeepNash’s own pieces, publicly available in- 
formation about both the opponent’s and 
DeepNash’s pieces, and an encoding of the 


40 last moves. This public information repre- 
sents the types each piece can still have given 
the history of the game. In total, the observa- 
tion contains 82 stacked frames encoded in a 
single tensor. The observation’s detail is given 
in the supplementary materials. 

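Given the regularization policy $\pi_{m,\mathrm{reg}}$ at
iteration $m$ and a trajectory, the reward
transform used at time step $t$ for player $i$ is

$$r^i_t - \eta \log\left(\frac{\pi(a_t|o_t)}{\pi_{m,\mathrm{reg}}(a_t|o_t)}\right) \text{ if } i = \psi_t, \quad \text{and} \quad r^i_t + \eta \log\left(\frac{\pi(a_t|o_t)}{\pi_{m,\mathrm{reg}}(a_t|o_t)}\right) \text{ if } i \neq \psi_t.$$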

The dynamics stage of the method is composed
of two parts. The first part estimates the
value function, which is done through an
adaptation of the v-trace estimator (26) to
the two-player imperfect information case,
resulting in a parameter update direction
$\mathrm{Update}_{\mathrm{value}}$. The second part learns the policy
through the Neural Replicator Dynamics
(NeuRD) update (27) using a new estimate of
the state action value based on the v-trace
estimator, resulting in a parameter update
direction $\mathrm{Update}_{\mathrm{policy}}$. These parts are detailed
in the supplementary materials.

After a fixed number of learning steps, an
approximate fixed point policy, $\pi_{m,\mathrm{fix}}$, is obtained,
which is then used as the next regularization
policy, $\pi_{m+1,\mathrm{reg}} = \pi_{m,\mathrm{fix}}$. The three stages are
repeated using a smooth transition from the
reward transformation of iteration $m$ to the
one of iteration $m + 1$.

Fig. 5. Illustration of DeepNash bluffing. Shown
is an illustration of DeepNash bluffing in three human
(red) versus DeepNash (blue) matches. (A) Positive
bluffing. (B) Negative bluffing. (C) DeepNash makes a
scout (2) behave like a spy and gains material.

Directly learning with the above-described
method leads to convergence to an empirically
satisfying solution, which, however, is slightly
distorted by low-probability mistakes. Those
mistakes appear because the softmax projection
used to compute the policy from the logits
assigns a nonzero probability to every action.
To alleviate this issue, the policy is fine-tuned
during training by applying additional thresholding
and discretization to the action probabilities.
The supplementary materials provide
more details on this aspect and also describe a
few additional heuristics applied at test time
that remove obvious mistakes from the policy.
As opposed to the R-NaD model-free training
algorithm, these heuristics are Stratego specific.
Qualitatively, they do remove rare mistakes in
matches against humans, but they do not give
notable quantitative improvements in self-play
(see the supplementary materials).
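
The exact fine-tuning procedure is given only in the supplementary materials, so the following is a purely hypothetical sketch of what thresholding and discretizing a softmax policy could look like. The threshold value, the number of discretization levels, and the function name are all assumptions and should not be read as DeepNash's settings.

```python
def threshold_and_discretize(probs, threshold=0.03, n_levels=32):
    """Zero out low-probability actions, then snap the rest to multiples of 1/n_levels.

    threshold and n_levels are hypothetical values chosen for illustration.
    """
    # 1) Thresholding: drop actions below the threshold; always keep the most
    #    likely action so the result remains a valid distribution.
    top = max(probs)
    kept = [p if (p >= threshold or p == top) else 0.0 for p in probs]
    total = sum(kept)
    kept = [p / total for p in kept]

    # 2) Discretization: round to the nearest multiple of 1/n_levels and renormalize.
    snapped = [round(p * n_levels) / n_levels for p in kept]
    total = sum(snapped)
    if total == 0.0:          # every entry rounded to zero; fall back to the thresholded policy
        return kept
    return [p / total for p in snapped]


if __name__ == "__main__":
    print(threshold_and_discretize([0.70, 0.29, 0.01]))
```

The point of the sketch is only that a softmax output never assigns exactly zero probability, so some explicit post-processing is needed if rare, clearly bad actions are to be ruled out entirely.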


Results 


This section presents an overview of the eval- 
uation results of DeepNash against both human 
expert players and current state-of-the-art 
Stratego bots. For the former, DeepNash has 
been evaluated on the Gravon platform, a well- 
known online games server popular among 
Stratego players. For the latter, DeepNash has
been tested against eight known AI bots that 
play Stratego. A detailed analysis is also pre- 
sented with regard to some of the capabilities 
of the agent’s game play, including deploy- 
ment, bluffing, and trading off of material 
versus information. 


Evaluation on Gravon 


Gravon is an internet platform for human 
players that offers several online games, in- 
cluding Stratego. It is by far the largest online 
platform for Stratego, and is where some of 
the strongest players compete. For more de- 
tails on the platform and its ranking system, 
please refer to the supplementary materials. 

DeepNash was evaluated against top human 
players over the course of 2 weeks in the be- 
ginning of April 2022, resulting in 50 ranked 
matches on Gravon. Of these matches, 42 (84%) 
were won by DeepNash. In the Classic Stratego 
Challenge Ranking 2022 at that time, this cor- 
responded to a rating of 1799, which put 
DeepNash in third place of all ranked Gravon 
Stratego players (the top two ratings were 
1868 and 1831). In the all-time Classic Stratego 
Ranking, this resulted in a rating of 1778, which 
also put DeepNash in the third place of all 
ranked Gravon Stratego players (the top two 
ratings were 1876 and 1823). The rating for this 
leaderboard considers all ranked games going 
back to the year 2002. 

These results confirm that DeepNash reaches 
a human expert level in Stratego only through 
self-play learning and without bootstrapping 
from existing human data. 


Evaluation against state-of-the-art Stratego bots 


DeepNash was also evaluated against several 
existing Stratego computer programs: Probe 
was a three-time winner of the Computer 
Stratego World Championship (2007, 2008, 
and 2010); Master of the Flag won that cham- 
pionship in 2009; Demon of Ignorance is an 
open-source implementation of Stratego with
an accompanying AI bot; and Asmodeus, Cel- 
sius, Celsius1.1, PeternLewis, and Vixen are 
programs that were submitted in an Austra- 
lian university programming competition in 
2012 (see the supplementary materials for 
more details). 

As shown in Table 1, DeepNash won the 
overwhelming majority of games against all 
of these bots despite not having been trained 
against any of them and only being trained 
using self-play. Therefore, it is not necessarily 
expected that the residual losses against some 
of these bots would vanish even if the exact 
Nash-equilibrium were reached. For example, 
in most of the few matches that DeepNash 
has lost against Celsius1.1, the latter played a 
high-risk strategy of capturing pieces early on 
with a high-ranking piece and thus was try- 
ing to get a significant material advantage. 
Most often, this strategy does not work, but
occasionally it can lead to a win. 


Illustration of DeepNash’s abilities 


The only goal of the algorithm behind DeepNash 
is to learn a Nash equilibrium policy and, by 
doing so, to learn qualitative behavior that one 
could expect a top player to master. 

Indeed, the agent is able to generate a wide 
range of deployments, which makes it diffi- 
cult for a human player to find patterns to ex- 
ploit by adapting their own deployment. We 
describe DeepNash’s deployment behavior in 
more detail in the supplementary materials 
(see the additional results section). Deep- 
Nash was able to make nontrivial trade-offs 
between information and material, to execute 
bluffs, and to take gambles when needed. The 
rest of this section illustrates these behaviors 
through matches that were played on Gravon. 

For convenience, the behavior is described 
in a way a human observer might naturally 
interpret it, including terms such as “decep- 
tion” and “bluffing,” which arguably refer to 
mental states that the program does not have. 


Trade-off between information 
and material 


An important tactic in Stratego is to keep as 
much information as possible hidden from an 
opponent to gain an advantage. During cer- 
tain game situations, there will be trade-offs to 
be considered in which a player needs to 
balance the value of capturing an opponent’s 
piece (or even moving a piece), and thus re- 
vealing information about their own piece, 
versus not capturing a piece (or not moving) 
but keeping the identity of a piece hidden. 
DeepNash was able to make such trade-offs 
in extraordinary ways. 

Figure 4A shows a situation in which 
DeepNash (in blue) was behind in pieces (it 
lost a 7 and an 8) but was ahead in inform- 
ation; the opponent in red has its 10, 9, an 8 
and two of its 7’s revealed. Valuing inform- 
ation and material in Stratego is nontrivial 
a priori, but the agent has learned a policy 
through self-play that seems to naturally 
make this trade-off. In the above example, 
DeepNash was behind in material but knew
the identity of many of the opponent’s
high-ranked pieces. Conversely, almost all
of DeepNash’s remaining pieces had not yet
moved, leaving its opponent in the dark.
The value function (v = 0.403) credited this
information asymmetry as an advantage for
DeepNash (with an expected win rate of ~70%)
despite its having less material on the board.
This game was won by DeepNash.

The second example in Fig. 4B shows a sit- 
uation in which DeepNash had the opportu- 
nity of capturing the opponent’s 6 with its 9, 
but this move was not considered, probably 
because protecting the identity of the 9 was 


deemed more important than the material gain. 
The situation also illustrates the stochasticity 
of DeepNash’s policy during game-play. 


Deceptive behavior and bluffing 


In addition to being able to value asymmetry 
of information, one can also expect the agent to 
occasionally bluff to deceive its opponent and 
potentially gain an advantage. The situations 
shown in Fig. 5, A to C, illustrate this ability. 
Figure 5A illustrates positive bluffing, in which 
a player pretended that a piece had higher 
value than it actually did. DeepNash (blue) 
chased the opponent’s 8 with an unknown 
piece, a scout (2), pretending it was the 10. The 
opponent believed that this piece had a high 
chance of being the 10 and guided it next to its 
spy (which could capture the 10). In an at- 
tempt to capture this piece, however, the op- 
ponent lost its spy to DeepNash’s scout. 

A second type of bluff, called negative bluf- 
fing, is shown in Fig. 5B. In contrast to a 
positive bluff, this tactic entails pretending a 
piece is of a lower rank. Here, the movement 
of the unknown 10 of DeepNash (blue) was 
interpreted by the opponent as a positive bluff 
because they tried to capture it with a known 8. 
DeepNash’s move could have been interpreted 
as moving the spy closer to the opponent's 10, 
for example. The opponent instead encountered 
DeepNash’s 10 and lost an 8. 

A more complex bluff is shown in Fig. 5C, 
where DeepNash (blue) brought its unrevealed 
scout (2) close to the opponent’s 10, which can 
be easily interpreted as a spy. This tactic ac- 
tually allowed blue to capture red’s 5 with its 7 
a few steps later, thereby gaining material 
but also preventing the 5 from capturing the 
scout (2), and revealing that it was actually 
not the spy. 


Conclusion 


This work introduces a new game-theoretic 
method at scale that allows for AI to play the 
imperfect information game Stratego from 
scratch in self-play up to a human expert level, 
as illustrated by our bot DeepNash. This model- 
free learning method combines a deep resid- 
ual neural network with the game-theoretical 
R-NaD multiagent learning algorithm. No 
form of search or explicit opponent modeling 
is performed during training, and DeepNash 
only relies on the use of some game-specific 
heuristics at test time. As such, the method 
underlying DeepNash takes a contrasting ap- 
proach to state-of-the-art search-based learn- 
ing methods that have been successfully applied 
to other complex games such as Go and chess 
and to imperfect information games such as 
poker and Scotland Yard. However, because 
of their computational toll and the inherent 
complexity of the Stratego game itself, those 
methods are not applicable to such an elab- 
orate game. 

The core component behind DeepNash is
the at-scale implementation of the R-NaD al- 
gorithm. It performs three essential stages in 
an iteration of the algorithm: reward trans- 
formation starting from a random regularized 
policy to define a modified game, subsequent- 
ly applying the replicator dynamics on this 
modified game to converge to a fixed point 
policy, and finally updating the regularization 
policy to this new fixed point. Repeatedly ap- 
plying this three-stage process yields a strategy 
that is empirically difficult to exploit. 

Evaluated against other AI bots, DeepNash 
achieved a minimum win rate of 97%, and in 
the evaluation against human expert players 
on the Gravon platform, DeepNash achieved 
an overall win rate of 84%, which placed it in 
the top-three rank of both the year-to-date 
(2022) and all-time leaderboards. This is 
an extraordinary result that the Stratego 
community did not believe would have been 
possible with current techniques, judging 
by quotes from Thorsten Jungblut (owner of 
the Gravon platform) and Vincent de Boer, 
which can be found in the supplementary 
materials. 

Looking forward, at this stage, there are no 
indications of how R-NaD fares beyond zero- 
sum two-player settings. However, it is rea- 
sonable to assume that it can unlock further 
applications of RL methods in real-world mul- 
tiagent problems with astronomical state spaces 
characterized by imperfect information, which 
are currently out of reach for state-of-the-art 
AI methods to be applied in an end-to-end 
fashion. For example, state-of-the-art methods 
on two-player poker (4) have successfully trans- 
ferred to six-player poker (5). Many applica- 
tions can be found in this larger class of games, 
including crowd and traffic modeling, smart 
grid, auction design, and market problems. 


QUALITY CONTROL


The human signal peptidase complex acts as a 
quality control enzyme for membrane proteins 


Andrea Zanotti, Joao P. L. Coelho, Dinah Kaylani, Gurdeep Singh, Marina Tauber,
Manuel Hitzenberger, Dönem Avci, Martin Zacharias, Robert B. Russell,
Marius K. Lemberg, Matthias J. Feige


Cells need to detect and degrade faulty membrane proteins to maintain homeostasis. In this study, 
we identify a previously unknown function of the human signal peptidase complex (SPC)—the 
enzyme that removes endoplasmic reticulum (ER) signal peptides—as a membrane protein quality 
control factor. We show that the SPC cleaves membrane proteins that fail to correctly fold or 
assemble into their native complexes at otherwise hidden cleavage sites, which our study reveals to 
be abundant in the human membrane proteome. This posttranslocational cleavage synergizes with 
ER-associated degradation to sustain membrane protein homeostasis and contributes to cellular 
fitness. Cryptic SPC cleavage sites thus serve as predetermined breaking points that, when exposed, 
help to target misfolded or surplus proteins for degradation, thereby maintaining a healthy
membrane proteome.


Membrane proteins are key to many important
biological functions but intrinsically
vulnerable to misfolding.
This renders molecular quality control
indispensable for cellular and
organism homeostasis (1). For aberrant membrane
proteins to be degraded, they must be
extracted from the lipid bilayer, as occurs
during endoplasmic reticulum (ER)-associated
degradation (ERAD) (2). Proteolysis can
facilitate this energetically unfavorable reaction
and synergizes with ERAD. One key
membrane-integral protease of the ER is the
signal peptidase complex (SPC), a defining en- 
zyme of the “signal hypothesis” for ER target- 
ing. Its canonical role is to cleave off N-terminal 
signal peptides (3), a process into which a 
recent cryo-electron microscopy structure 
has provided molecular insights (4). However, 
the SPC activity can also be hijacked by vi- 
ruses for maturation of polyproteins (5, 6). 
This indicates that viruses are capitalizing on 
a feature inherent to this enzyme and that 
cleavage by the SPC is not limited to signal 
peptides. 

To address this central open question on the 
substrate range of the SPC, we analyzed the 
whole human proteome for N-terminal SPC 
cleavage sites that are not classical signal pep- 
tides (fig. S1A). This approach identified 262 
membrane proteins that have a predicted SPC 
cleavage site following their N-terminal type I- 
oriented transmembrane (TM) helix (Fig. 1A). 
Candidates include several connexins, gap 
junction proteins for which previous in vitro 
experiments indicated signal peptide-like 
processing (7). Therefore, we first focused on 
this protein family. Disease-causing muta- 
tions distributed over the TM segments of 
connexin 32 (Cx32) and other connexins gave 
rise to two species on SDS-polyacrylamide gel 
electrophoresis (Fig. 1B and fig. S1, B to D). 
Epitope tagging showed that this was caused 
by N-terminal cleavage (fig. S1, C and D). Fur- 
thermore, mutations of predicted SPC cleavage 
sites (fig. S1, E and F) and the SPC inhibitor 
cavinafungin (8) blocked Cx32 processing (Fig. 
1C), together confirming membrane protein 
cleavage by the SPC. However, unlike signal 
peptides, which are cleaved early during bio- 
synthesis (3), Cx32 was processed posttrans- 
locationally (Fig. 1D). 

If the SPC can cleave after TM regions, its 
substrate range is vastly underestimated. We 
thus sought to determine whether the SPC 
also cleaves after internal type II-oriented TM 
helices. Indeed, 1297 human membrane pro- 
teins contain predicted SPC cleavage sites 
downstream of our initial N-terminal search 
window (fig. S2, A and B). Among those, we 
verified SPC-mediated cleavage for periph- 
eral myelin protein 22 (PMP22) (fig. S2C), a 
central nervous system protein whose mutations
cause human disease, as well as the
rhomboid pseudoprotease iRhom2 (fig. S2D), 
which controls trafficking and activation of 
the cell-surface sheddase TACE/ADAM17 
(tumor necrosis factor-a converting enzyme/ 
A disintegrin and metalloprotease domain- 
containing protein 17) (9, 10). For iRhom2, 
cleavage occurred several hundred residues
downstream of its N terminus (fig. S2D), whereas
PMP22 was cleaved after its third TM helix
(fig. S2, E to G).

Fig. 1. The SPC posttranslocationally cleaves mutant membrane proteins. (A) Analysis of the first
70 amino acids of all human proteins using SignalP4.1, exploiting the two different neural networks
[no recognition of TM regions (noTM) or TM region recognition included (TM); see methods for details].
Dark-purple circles indicate hits further analyzed in this study. (B) Analysis of Cx32 mutants in human
embryonic kidney 293T (HEK293T) cells reveals two species. Hsc70 served as loading control. (C) Immunoblot
of Cx32^C201R-FLAG in HEK293T cells treated with the SPC inhibitor cavinafungin (1 μM), where indicated
[empty (full) arrowhead: SPC-processed (full-length) Cx32^C201R]. (D) Autoradiograph of immunoprecipitated
Cx32^C201R-FLAG or FLAG-prolactin (Prl), labeled for 2 min and chased for the indicated times (mat: mature
Prl; gl: glycosylated mature Prl). Single-letter abbreviations for the amino acid residues are as follows: A, Ala;
C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg;
S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.

Fig. 2. SPCS1 is critical for membrane protein cleavage. (A) (Top) Immunoblot analysis of Cx32^C201R cleavage
in HEK293T (Ctrl), SPCS1 knockout (KO), or SPCS1 KO cells transiently reexpressing SPCS1. Actin served as
loading control. (Bottom) Quantification of cleaved Cx32^C201R fraction (mean ± SEM; n = 3; *P < 0.05; **P < 0.01).
Empty (full) arrowhead: SPC-processed (full-length) Cx32^C201R. (B) (Top) Autoradiograph of immunoprecipitated
Cx32^C201R-FLAG labeled for 5 min and chased for the indicated times in Ctrl or SPCS1 KO cells. (Bottom)
Quantification of cleaved Cx32^C201R relative to 0 hours (mean ± SEM). (C) (Left) Immunoblot analysis of
ectopically coexpressed SPCS1 mutants and iRhom2-FLAG in SPCS1 KO cells [empty (full) arrowhead:
SPC-processed (full-length) iRhom2]. (Right) Quantification of iRhom2 cleavage (mean ± SEM; *P < 0.05;
***P < 0.001). (D) Structure of the SPC (4), highlighting SPCS1 residues that affected iRhom2 cleavage.

These findings uncover a much broader
SPC client range than previously anticipated.
In silico analyses of predicted-to-be-cleaved
TM domains revealed clear differences from
signal peptides in sequence, hydrophobicity,
and length (fig. S3). Together, this raises the
question of how cleavage of membrane proteins
might be regulated. A previous CRISPR
screen revealed the noncatalytic SPC subunit 
SPCS1 (fig. S4A) to be involved in the matura- 
tion of flavivirus polyproteins (6), indicating a 
possible role in the processing of substrates 
other than signal peptides. Notably, SPCS1 
knockout reduced mutant Cx32 cleavage by 
more than 50%, and complementation with 
exogenous SPCS1 reestablished it (Fig. 2A). A
similar behavior was observed for all other 
tested substrates (fig. S4, B to E), revealing a 
key role for SPCS1 in membrane protein cleav- 
age. Knockdown of other SPC subunits affected 
Cx32 processing to different degrees (fig. S5A), 
suggesting a complex interplay of the sub- 
units, which goes beyond a general destabi- 
lization of the complex when one subunit is 
missing (fig. S5B). Pulse-chase experiments 
confirmed the role of SPCS1 in posttranslo- 
cational Cx32 cleavage (Fig. 2B). In contrast, 
SPCS1 ablation did not compromise signal
peptide processing (fig. S5, C to E). Instead, it
increased secretion of prolactin that contains
a classical signal sequence (fig. S5F), arguing
that SPCS1-dependent membrane protein
cleavage is functionally distinct from its role
in signal peptide processing (11). Together,
these findings led us to query SPCS1 in more 
detail. Coimmunoprecipitation experiments 
showed interaction of SPCS1 with iRhom2, 
but not the iRhom2-binding partner TACE 
(fig. S6A). Furthermore, mutations of evolu- 
tionarily conserved lipid bilayer-exposed 
residues within SPCS1 (fig. S6B) substantially 
affected iRhom2 cleavage (Fig. 2, C and D, and 
fig. S6C), while not disrupting SPC assembly 
(fig. S6D). Together, these findings support a 
function of SPCS1 in the recognition and pro- 
cessing of membrane proteins by the SPC. 
The fact that disease-linked membrane 
protein mutants are cleaved by the SPC may 
indicate a quality control function. We thus 
hypothesized that incorrectly folded mem- 
brane proteins might expose otherwise buried 
(i.e., cryptic) SPC cleavage sites and become 
prone to processing. Supporting this idea, every 
connexin mutant we had found to be cleaved 
by the SPC was ER-retained and failed to form 
gap junctions (fig. S7, A and B), indicating
misfolding. In contrast, inducing ER retention 
of correctly folded wild-type Cx32 only had 
a minor effect on cleavage (fig. S7, C to E). 
If misfolding indeed caused cleavage, stabi- 
lizing the mutation should block proteolytic 
processing. In agreement with this, cleavage 
was significantly reduced for a Cx32^C201R variant 
(Cys201→Arg) that contained a computationally 
designed second mutation to form 
a stabilizing salt bridge (Fig. 3, A and B). A 
similar behavior was observed for other Cx32 
mutants when secondary mutations stabiliz- 
ing the first lesion were introduced (fig. S8). 

Fig. 3. Misfolding and failed assembly promote SPC 
cleavage. (A) (Top) Model of the Cx32 double mutant (C201R plus the stabilizing second mutation), with inset 
highlighting D142 and R201. (Bottom) Histogram showing 
the distance between the terminal side-chain carbon atoms 
of D142 and R201, as obtained from molecular dynamics 
simulations (dashed line: 5-Å threshold needed for salt bridge 
formation). (B) Immunoblot of the indicated Cx32 mutants 
expressed in HEK293T in the presence of CB-5083 to inhibit 
ERAD. Empty (full) arrowhead: SPC-processed (full-length) 
Cx32. (C) (Top) Autoradiograph of immunoprecipitated 
Cx32^C201R-FLAG labeled for 10 min and chased for the indicated 
times. (Bottom) Quantification of SPC-mediated processing. 
Empty (full) arrowhead: SPC-processed (full-length) Cx32^C201R 
(mean ± SEM; n = 3; *P < 0.05; **P < 0.01 relative to control). 
(D) (Left) Immunoblot of hemagglutinin (HA)-tagged 
iRhom2 with increasing amounts of coexpressed FLAG-tagged 
TACE. Full (empty) arrowhead: full-length (processed) 
iRhom2. Actin was used as loading control. (Right) Quantification 
of the cleaved iRhom2 fractions (mean ± SEM; n = 4). 

Fig. 4. SPC action synergizes with ERAD. (A) Endogenous 
SPCS1 was immunoprecipitated from HEK293T (Ctrl) and 
SPCS1 KO cells (nonspecific binding control), and samples were 
immunoblotted for different subunits of the SPC and Hrd1 
complexes. Climp63 served as negative control. Asterisks 
indicate nonspecific bands. (B) (Top) Autoradiograph of 
HEK293T cells ectopically expressing Cx32^C201R-FLAG labeled 
for 10 min and chased for the indicated times. Where 
indicated, CB-5083 was added (10 µM) to inhibit p97-mediated 
dislocation. Full (empty) arrowhead: full-length (processed) 
Cx32^C201R. (Bottom) Quantification of full-length and cleaved 
Cx32^C201R (mean ± SEM; n = 3). (C) (Top) Immunoblot of 
HEK293T cells expressing FLAG-tagged Cx32^C201R and treated 
with CHX, cavinafungin (5 µM), and/or epoxomicin (epox, 
5 µM). Actin served as loading control. (Bottom) Quantification 
of full-length Cx32^C201R relative to time t = 0 (mean ± SEM; 
n = 4; *P < 0.05). (D) (Left) Processing of a canonical signal 
peptide. (Right) Quality control function of the SPC, where 
noncanonical, cryptic cleavage sites are processed. SPCS1 
may bind substrate TM domain followed by transition of the 
cleavage site into the SEC11 active site. Cleaved fragments 
are prone to ERAD. 

Further supporting the idea that misfolding 
causes SPC-mediated cleavage, disrupting a 
structural disulfide bridge in Cx32 increased 
cleavage, whereas other mutations without 
obvious structural consequences did not (fig. 
S9A). Pulse-chase experiments revealed that, 
when disulfide bonds in Cx32 were reduced 
by dithiothreitol (DTT), cleavage increased 
at a higher rate when DTT was added with a 
0.5-hour delay after protein synthesis than if it 
was added initially (Fig. 3C and fig. S9B). This 
increase in Cx32^C201R cleavage was independent 
of ER stress (fig. S9C). Of note, one of 
the disulfide bonds in Cx32 is close to its SPC 
cleavage site, so correct oxidative folding may 
shield this site (fig. S9D). Together, these find- 
ings further suggest that SPC cleavage serves 
as a posttranslational membrane protein qual- 
ity control step, which may be more efficient 
when early folding factors have dissociated. 
In addition to intrinsic misfolding, failed as- 
sembly of membrane protein complexes may 
lead to subunit instability (12). To test whether 
SPC cleavage is also involved in subunit as- 
sembly and abundance control, we expressed 
iRhom2 with its partner TACE. In agreement 
with SPC cleavage being attributable to im- 
balanced subunit stoichiometry, TACE co- 
expression reduced iRhom2 cleavage (Fig. 3D). 
Likewise, a truncation of the ERAD E3 ubiquitin 
ligase Hrd1 that is deficient in assembly with ER 
interaction partners (13) was partially cleaved 
and degraded in an SPCS1-dependent man- 
ner (fig. S10). 

Our data show that SPC cleavage is caused 
by defects in membrane protein folding and 
assembly, which suggests links to ERAD. Pre- 
vious mass spectrometry analyses indicated an 
interaction of the SPC with Hrd1 (14), which is 
relevant in the light of the SPC quality control 
functions. Extending these findings, coimmu- 
noprecipitation experiments revealed inter- 
action of Hrd1 and its cofactor FAM8A1 with 
the SPC at endogenous levels (Fig. 4A), and the 
SPC also coimmunoprecipitated with Hrd1 
(fig. S11A). In contrast, no pronounced interac- 
tion was observed with another major ERAD 
E3 ubiquitin ligase, gp78 (fig. S11B). Further 
supporting the functional relevance of this 
interaction, cycloheximide (CHX) chase showed 
that Hrd1 plays a role in clearance of SPC- 
generated Cx32^C201R fragments (fig. S11C). 
Inhibiting p97-mediated retrotranslocation 
during ERAD caused a strong accumulation 
of the SPC-cleaved form of Cx32^C201R, along 
with a modest effect on the full-length protein 
(Fig. 4B). This stabilization of the cleavage 
fragment further indicates that SPC supports 
degradation of mutant membrane proteins, 
the generality of which was corroborated by 
similar results obtained for mutants of PMP22, 
Hrd1, and other connexins (figs. S2C, S10B, 
and S11D). In agreement with SPC-catalyzed 
cleavage promoting ERAD, cavinafungin led 
to a mild but significant stabilization of full- 
length Cx32^C201R in CHX chase assays (Fig. 4C). 
This partial stabilization, which showed addi- 
tive effects with proteasome inhibition, may 
be due to different ERAD pathways involved 
in Cx32 degradation (15), serving as an exam- 
ple of the general redundancy in ER protein 
homeostasis (16). Together, our data show 
that noncanonical SPC cleavage contributes 
to membrane protein quality control, and 
our computational analyses suggest this to 
be widespread. SPC-catalyzed cleavage of TM 
segments may thus be relevant for resilience 
toward protein folding stress in the ER. Con- 
sistent with this, mRNA levels of the SPC sub- 
units were up-regulated upon induction of ER 
stress (fig. S12A), although ER import gener- 
ally drops (17), and SPCS1 knockout cells per- 
formed significantly worse under ER stress 
conditions (fig. S12B). In conjunction with 
the absence of observed detrimental effects 
of SPCS1 ablation on canonical signal peptide 
processing (fig. S5, C to F), this indicates a role 
for the SPC in cellular protein homeostasis. 
This notion was further supported by our 
findings that SPC played a role in rebalancing 
Hrd1 levels after ER stress subsided (fig. S12C). 

This study reveals that the SPC, discovered 
in the 1970s to cleave ER signal peptides (3), 
has a previously unanticipated and wide- 
spread quality control function for membrane 
proteins. Cryptic SPC cleavage sites thus might 
have evolved to serve as predetermined “break- 
ing points” that, when exposed, support tar- 
geting misfolded or surplus proteins toward 
degradation. The quality control activity of 
the SPC helps to remove faulty membrane 
proteins from the cellular pool, to control pro- 
tein assembly, and to mitigate ER stress, to- 
gether increasing cellular fitness. This second 
mode of posttranslocational SPC processing 
(Fig. 4D) shows notable parallels to N-linked 
glycosylation, which can occur co- and post- 
translationally (18). The latter depends on 
recognizing cryptic glycosylation sites within 
otherwise folded domains as part of a triage- 
salvage system (19). Our data suggest that 
the noncatalytic SPCS1 subunit, which has 
recently been shown to be down-regulated 
in Alzheimer’s disease patients’ brains (20), is 
critically involved in the SPC quality control 
functions (Fig. 4D). To protect immature mem- 
brane proteins from SPC-mediated cleavage, 
the large number of membrane-integral and 
soluble chaperones that are in close proximity 
to the Sec61 translocon (21) may be rele- 
vant, whereas folded membrane proteins 
can be expected to have cryptic cleavage 
sites no longer accessible to the SPC (fig. S13). 
In general, the ER translocon appears to 
provide a protected environment for protein 
biogenesis. In contrast, at distal sites and/or 
after a time window that provides an opportunity 
for folding and further transport, pro- 
degradation quality control functions are 
more likely. Extending these concepts, our 
study now reveals a role of the SPC in ER protein 
quality control. 

VOLCANOLOGY 


Magma accumulation at depths of prior rhyolite 
storage beneath Yellowstone Caldera 


Ross Maguire, Brandon Schmandt, Jiaqi Li, Chengxin Jiang, Guoliang Li, 
Justin Wilgus, Min Chen 


Seismic tomography has provided key insight into Yellowstone’s crustal magmatic system that includes 
attempts to understand the melt distribution in the subsurface and the current stage of the volcano’s 
life cycle. We present new tomographic images of the shear wave speed of the Yellowstone magmatic 
system based on full waveform inversion of ambient noise correlations, which illuminates shear wave 
speed reductions of greater than 30% associated with Yellowstone’s silicic magma reservoir. The 
slowest seismic wave speeds (shear wave speed less than 2.3 kilometers per second) are present at 
depths between 3 and 8 kilometers, overlapping with petrological estimates of the assembly depth 
of erupted rhyolite bodies. Assuming that Yellowstone’s magmatic system is a crystal mush with broadly 
distributed melt, we estimate a partial melt fraction of 16 to 20%. 


The Yellowstone volcanic system has fueled 
some of the largest explosive caldera- 
forming eruptions in the geologic record, 
including three catastrophic eruptions 
in the past 2.1 million years (1, 2). Explosive 
silicic eruptions on this scale can have 
widespread environmental impacts, including 
continent-wide ash falls, global climate dis- 
ruption, and extinction events (3, 4). At Yellow- 
stone, the most recent Lava Creek eruption 
(0.64 million years ago) emplaced >1000 km³ 
of rhyolitic material and blanketed much of 
the western United States and Great Plains in 
ash (1, 5). The subsequent collapse of the mag- 
ma reservoir shaped the current Yellowstone 
Caldera in northwestern Wyoming, which has 
since been filled with rhyolite flows as young as 
70,000 years old (1). Although it is clear from 
geophysical observations that the modern 
Yellowstone magmatic system remains active 
(6, 7), questions persist about the volume and 
distribution of melt and how it compares with 
conditions that preceded prior eruptions. 

An emerging view of continental magma res- 
ervoirs is that a crystal mush zone (a crystal- 
dominated, partially molten body) can persist 
in the crust over long time scales (100,000 years 
or longer) but that eruptible melt-rich zones 
are likely to be short-lived (<5000 years) (8-10). 
From this perspective, layers of eruptible silicic 
melt rapidly accumulate near the top of the 
crystal mush zone before eruption. Thus, the 
presence or absence of a melt-rich zone at or 
near the top of the silicic magma reservoir could 
be an important indicator of where Yellow- 
stone currently sits in its eruptive life cycle. 
Seismic tomography provides one of the 
best tools for inferring the presence of melt in 
crustal magmatic systems, and numerous 
previous studies have produced images of the 
subsurface below Yellowstone that reveal a 
contemporary magma reservoir in the mid- 
to upper crust (11-15). The magma reservoir 
in these studies is typically imaged as a slow 
shear wave speed (Vs) anomaly of up to 10%, 
suggesting a relatively melt-poor system (<10% 
partial melt fraction). However, spatially iso- 
lated observations of scattered teleseismic body 
waves recorded at seismometers in or near 
Yellowstone Caldera indicate that the degree 
of partial melt could be much higher (16). 
Seismic tomography has yielded important 
clues into Yellowstone’s magmatic system, but 
imaging melt-rich zones remains challenging 
because small-scale magma bodies with se- 
verely reduced seismic wave speed are unlikely 
to produce substantial travel time delays of 
first arrivals due to wavefront healing (17). 
Additionally, strong low-wave speed anoma- 
lies may be further diminished in seismic im- 
ages because of assumptions such as locally 
one-dimensional (1D) or ray-based seismic 
propagation and inversion regularization (18). 
Advances in tomographic inversions based on 
3D numerical modeling of seismic waveforms, 
sometimes referred to as “full waveform in- 
version” (FWI), can overcome some limitations 
of conventional methods that rely on asymp- 
totic ray-based approximations (19-21). The 
3D sensitivity kernels used in FWI are able to 
account for complex wave propagation effects 
and thus more accurately map the location and 
amplitude of seismic wave speed anomalies. 
We present new images of the Vs below 
Yellowstone based on FWI of ambient noise 
correlations. To take full advantage of the rich 
and diverse seismic datasets available in the 
Yellowstone region, our images combine data 
from numerous broadband deployments over 
the past 20+ years, including the EarthScope 
Transportable Array, several dense temporary 
deployments, and a recently updated seismic 
network within Yellowstone National Park 
(Fig. 1). Our tomographic inversion approach 
uses vertical component noise correlation func- 
tions (NCFs) from 4991 interstation pairs and 
minimizes frequency-dependent travel-time 
differences between NCFs and 3D synthetic 
waveforms in six overlapping period bands 
between 5 and 30 s (22). The final model m_10 
was achieved after 10 adjoint iterations, and 
it reduces the total misfit by ~50% compared 
with the starting model m_00, which is based 
on conventional inversion techniques (22). 

The tomographic images (Fig. 2) illuminate 
a strong Vs anomaly corresponding to the 
magma reservoir in the mid- to upper crust 
centered below the Yellowstone Caldera, with 
peak Vs reductions of >30%, which is substantially 
stronger than previously recognized. The 
slowest seismic wave speeds (Vs < 2.3 km/s) 
are present at depths between 3 and 8 km, 
with a minimum of 2.15 km/s at 5 km depth. 
Previous ray-based seismic tomography stud- 
ies have imaged a mid- to upper crustal re- 
servoir at ~5 to 15 km depth [for example, 
(13, 14)]; the peak velocity anomaly typically 
lies between 7 and 10 km depth, which is 
deeper than most petrological estimates of the 
storage depth of previously erupted rhyolitic 
magmas. For example, a recent petrological 
study of the Lava Creek Tuff suggested melt 
storage pressures of 80 to 150 MPa, corre- 
sponding to a depth range of ~3 to 6 km (23). 
Similarly, the storage pressure of CO2-rich 
magmas from the Central Plateau Member 
Rhyolites (eruption ages 175,000 to 70,000 years) 
is estimated to be between 90 and 150 MPa 
(24). Thus, our tomographic images suggest 
contemporary magma storage in a depth range 
overlapping with the storage zone of magmas 
that have supplied both explosive and effu- 
sive silicic eruptions at Yellowstone. 
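
The pressure-to-depth correspondence quoted above (80 to 150 MPa for roughly 3 to 6 km) can be checked with a simple lithostatic estimate, depth = P / (ρg). This is a minimal sketch and not the procedure of the cited petrological studies; the upper-crustal density of 2600 kg/m³ and the function name `pressure_to_depth_km` are assumptions made for illustration.

```python
# Lithostatic estimate: depth = P / (rho * g).
# The density value is an assumed average for the upper crust, not a value from the study.
G = 9.81  # gravitational acceleration, m/s^2


def pressure_to_depth_km(pressure_mpa, density_kg_m3=2600.0):
    """Return the depth (km) at which lithostatic pressure equals pressure_mpa."""
    pressure_pa = pressure_mpa * 1e6
    depth_m = pressure_pa / (density_kg_m3 * G)
    return depth_m / 1000.0


if __name__ == "__main__":
    # Storage pressures quoted in the text for erupted rhyolites.
    for p in (80, 90, 150):
        print(f"{p} MPa -> {pressure_to_depth_km(p):.1f} km")
```

With the assumed density, the quoted pressure range maps to depths slightly above 3 km and slightly below 6 km; a denser crust would shift both values shallower.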

In map view, the maximum Vs reduction is 
offset from the center of the caldera toward 
the east (Fig. 2A) and overlaps with a cluster of 
seismicity below Yellowstone Lake. The max- 
imum depth extent of the low Vs region below 
the caldera is ~30 km, although the anomaly is 
more subdued below 10 km, which suggests that 
melt is most concentrated at shallower reser- 
voir depths. In addition to the low-velocity 
anomaly below Yellowstone Caldera, two other 
low-Vs regions are notable. First is a region of 
low Vs in the lower crust (~35 to 40 km depth) 
that extends to the southwest of Yellowstone 
along the Snake River Plain and appears to 
connect to the anomaly below the caldera (Fig. 
2B). 

Fig. 1. Broadband seismic data used in this study. (A) Map of the station distribution. Symbols depict different seismic networks. (B) Record section of vertical 

Fig. 2. Yellowstone shear wave speed model. (A) Map view of Vs at 5-km depth below the surface. The irregular outline toward the east side of the caldera is Yellowstone 
Lake. (B and C) Vertical cross sections along profiles X-X' and Y-Y', respectively. Seismic events with Mw > 3.0 that occurred in the past 20 years are plotted as gray 
circles. In vertical cross sections, seismic events within ±15 km in the lateral direction are projected onto the slice. (Inset) The Vs profile at the center of the caldera. The gray 
shaded region corresponds to petrologically estimated storage depths of past eruptive reservoirs (22, 23). 

Fig. 3. Waveform fits between observed and synthetic seismic data. (A to C) Waveforms from virtual 
source TA.H18 [(B), yellow star], filtered between 6 and 9 s. (D to F) Waveforms from a Mw 4.2 event located 
northeast of Yellowstone Caldera, filtered between 6 and 12 s. Observed data are indicated in black, and 
synthetics from the starting model and final model are indicated in red and green, respectively. Earthquake 
data were not used in the seismic inversion and are only shown for model validation. 

There, Vs reaches 3.5 to 3.6 km/s (approximately 
8 to 9% slower than the regional 
average), which is consistent with previous 
studies [for example, (12)]. This region lies 
above exceptionally slow mantle (12) and could 
represent a deep crustal reservoir of basaltic 
melt, although how it connects with the shal- 
low silicic reservoir is unclear. Second, a region 
of low Vs to the southeast of Yellowstone 
Caldera is imaged in the mid-crust with a minimum 
Vs of ~3.4 km/s (Fig. 2C), which is enhanced 
compared with that in previous studies 
[for example, (15)]. 

Improved waveform fitting compared with 
our tomographic starting model demonstrates 
that the extreme Vs anomaly below the caldera 
is required by the data (Fig. 3 and figs. S8 to 
S10). Short-period waveform fits along paths 
that directly traverse the caldera are most 
notably improved. Shown in Fig. 3, A to C, is a 
comparison between observed and synthetic 
NCFs filtered between 6 and 9 s for a virtual 
source at station H18A. Although the starting 
model m_00 produces good waveform fits for 
most of the paths, a large travel-time delay 
(>5 s) was observed at station IPID, which is 
located directly opposite from Yellowstone 
Caldera so that most of the path samples the 
magma reservoir. The waveform observed 
along this path is well explained by the final 
FWI model m_10 (Fig. 3C). As further validation, 
we show waveform fits from a local moment 
magnitude (Mw) 4.2 earthquake that 
occurred to the northeast of the caldera on 
25 March 2008 (Fig. 3, D to F). Data from this 
event were not used in the tomographic inversion; 
however, Rayleigh wave travel-time 
misfits were noticeably improved with model 
m_10. The improved fit is most appreciable at 
stations LKWY and YFT, where paths most 
directly sample the magma reservoir (Fig. 3F). 

The presence of partial melt in the crust 
has the effect of reducing seismic wave speed, 
although the relationship between the melt 
fraction and the spatially averaged Vs structure 
as seen with seismic tomography is difficult 
to constrain and depends on temperature, 
composition, and the geometrical organiza- 
tion of melt in the crust. We estimated the 
melt fraction of Yellowstone’s upper crustal 
reservoir using a theoretical model of a solid- 
liquid composite with ellipsoidal melt inclu- 
sions defined by their aspect ratio (22, 25). We 
show the modeled relationship between Vs 
and melt fraction for various aspect ratios, 
calibrated for a rhyolitic composition at 5 km 
depth (Fig. 4A). Silicic partial melt in textural 
equilibrium exhibits a dihedral angle of 20° to 
40°, corresponding to an aspect ratio of 0.1 to 
0.15 (26, 27), which implies a crystal mush with 
a melt fraction of 16 to 20% near the top of the 
upper-crustal reservoir (Fig. 4B). Under these 
assumptions, we estimated the total volume of 
silicic melt in the upper-crustal reservoir to 
be >1600 km³ (22). On the basis of this melt 
fraction scaling, previous shear wave tomography 
models of Yellowstone’s magmatic system 
(12, 15) would map to ~10% melt or less. If 
melt is organized in networks of thin crystal- 
poor sills (15), the aspect ratio could be less 
than 0.1. However, sills may not contain 100% 
melt, and silicic partial melt is likely to exist at 
grain boundaries in crystal-rich portions of the 
magma reservoir. An aspect ratio of the mag- 
matic system of <0.1 would imply a lower melt 
fraction but a stronger organization of melt 
into layered structures, which could decrease 
the stability of the system because an eruptible 
body could rapidly assemble from intercon- 
nected sills (28). 

studies. (B) Cartoon diagram of Yellowstone's crustal magmatic system from 

Mobilization and eruption of a crystal mush 
is possible when the melt fraction exceeds the 
critical threshold that marks the transition 
from a crystal-supported framework to a fluid 
suspension of crystals, which is accompanied 
by a dramatic viscosity decrease. Estimates of 
the critical melt fraction range from ~35% (16) 
to ~50% melt (29); thus, the melt fraction we 
estimated is substantially lower than what 
would be expected if a large fraction of the 
Yellowstone reservoir were in the eruptible 
stage of its life cycle. However, the presence of 
small subset volumes of concentrated silicic 
melt cannot be ruled out. For example, fea- 
tures smaller than the minimum seismic wave- 
length (in this case, ~15 km) may not be well 
resolved, suggesting that high-melt fraction 
bodies of several hundred cubic kilometers or 
more could be present in Yellowstone’s mag- 
ma reservoir. Such subset volumes of the mag- 
ma reservoir would be capable of supplying 
eruptions comparable in size with those of the 
170,000- to 70,000-year-old Central Plateau 


Early snowmelt and polar jet dynamics co-influence 
recent extreme Siberian fire seasons 


Rebecca C. Scholten, Dim Coumou, Fei Luo, Sander Veraverbeke 


The summers of 2019, 2020, and 2021 experienced unprecedented fire activity in northeastern Siberia, 
driven by record high spring and summer temperatures. Many of these fires burned in permafrost 
peatlands within the Arctic Circle. We show that early snowmelt together with an anomalous Arctic front 
jet over northeastern Siberia promoted unusually warm and dry surface conditions, followed by 
anomalously high lightning and fire activity. Since 1966, spring snowmelt has started 1.7 days earlier 
each decade. Moreover, Arctic front jet occurrences in summer have more than tripled in frequency over 
the last 40 years. These interconnected climatological drivers promote extreme fire activity in eastern 
Siberia, including a northward shift of fires, which may accelerate the degradation of carbon-rich 
permafrost peatlands. 


The years 2019, 2020, and 2021 marked 
the three largest fire years since at least 
2001 in eastern Siberia. A considerable 
portion (38%) of the burned area occurring 
within the Arctic Circle between 
2001 and 2021 occurred in these 3 years only, 
and almost all (92%) was located in the 
northern larch forests and tundra of central 
and eastern Siberia. Larch trees are among the 
only tree species adapted to poor, shallow 
permafrost soils (2) and they dominate Siberia’s 
northeastern boreal forest. Subsurface, carbon- 
rich fuels make up about 75% of the carbon 
emissions from wildfires in this region (2). 
Northern wildfires are a strong accelerator 
of permafrost thaw (3), leading to emissions 
from both gradual and abrupt thaw processes 
(4). Such emissions are currently not included 
in the carbon budgets underlying the Paris 
Climate Agreement (5). 

The Arctic is warming twice as fast as the 
global average (6, 7), a phenomenon known as 
Arctic amplification. Likewise, fire activity in 
high-latitude regions is intensifying as a result 
of increases in dry fuels (8, 9) and lightning 
(10, 11). Fuels are abundant in arctic-boreal 
ecosystems and fire weather determines fuel 
flammability, ignition potential, and fire spread 
behavior. Lightning from convective thunderstorms 
is responsible for ignition for most 
arctic-boreal burned areas (10). Thus, large 
fire years in boreal regions usually originate 
during short periods of extreme weather (9). 
Warm and dry spring and summer conditions 
also drive circumpolar tundra fires (12). In 
June 2020, a 6-month period of anomalously 
high temperatures in eastern Siberia culmi- 
nated in the highest temperature on record 
(38°C) measured within the Arctic Circle (3). 
Recent studies (13, 14) have linked these ex- 
treme temperatures to anthropogenic climate 
change. Moreover, spring snowmelt occurred 
exceptionally early over Siberia in 2020, con- 
sistent with pronounced downward trends in 
circumpolar Arctic spring snow cover (15). 
Large fires in southeastern Siberia (16) and 
the western US (17) have previously been 
linked to early snowmelt onsets. Early snow- 
melt enables a longer period of fuel drying 
during which ignitions and fire spread can 
occur (17) and influences atmospheric circula- 
tion by governing surface heating and evapo- 
transpiration (18, 19). 

Over the past decades, notable changes have 
occurred in hemispheric-scale summer circulation 
in the mid- to high latitudes of the 
Northern Hemisphere (20-23). Although Arctic 
amplification is generally more prominent 
during the cold season, it remains pronounced 
over the warm season as well. In summer, 
Arctic amplification is characterized by (i) re- 
ductions in snow cover over land (24), (ii) rapid 
retreat of sea ice over the Arctic ocean (25), 
(iii) enhanced warming at higher altitudes 
over the Arctic (26), and (iv) associated re- 
ductions in the equator-to-pole temperature 
gradient (20). This results in a hemispheric 
warming pattern that peaks near 65°N and 
that is likely attributable to anthropogenic 
carbon emissions (27). This warming pattern 
is likely linked to the observed weakening of 
the midlatitude westerlies and storm tracks 
since 1979, something that is also projected by 
climate models under future emission scenarios 
(20, 21, 28). Further, the warming pattern might 
favor the occurrence of “double jet” states, 
which are characterized by a narrow subtropi- 
cal jet and pronounced Arctic front jet (29). In 
fact, recent work has shown that such double 
jet states have become more frequent and 
more persistent over the last 40 years, with 
important implications for weather extremes 
in the midlatitudes (21, 27, 29-31). Potential 
effects on the climate-sensitive high latitudes, 
however, remain poorly understood. 


Extreme fire years 


Fig. 2. Composite plots of synoptic and surface weather anomalies 
during weeks of high fire activity in northeastern Siberia. (A) 2-m 
temperature anomaly, (B) total precipitation anomaly, (C) 2-m vapor 
pressure deficit anomaly, (D) fire weather index anomaly, (E) 500-hPa 
geopotential anomaly, (F) 250-hPa wind anomaly. Weeks of high fire 

We analyzed fire data from the Moderate 
Resolution Imaging Spectroradiometer (MODIS) 
Collection 6 burned area (32) and active fire 
(33) products for the years 2001 to 2021 and 
extracted weeks with extreme fire activity over 
the study area (Fig. 1), based on deviations 
from the climatology of the burned area. Between 
2001 and 2021 we found a total of 
36 weeks during June, July, and August when 
the burn anomaly in northern (>56°N) larch 
forests of Siberia exceeded the climatological 
average by more than one standard deviation 
(SD), half of which occurred during the summers 
of 2019 to 2021 (table S1). Burning during 
these extreme fire weeks accounted for 41% of 
the total burned area between 2001 and 2021 
(Fig. 1A and table S1). The year 2021 was the 
largest fire year on record, comprising 16% of 
the burned area in the Siberian larch forests 
between 2001 and 2021, followed by 2019 and 
2020 with 10 and 9%, respectively. In addition, 
2019 and 2020 featured extraordinarily large 
burned areas in northern tundra regions (Fig. 
1B). The anomalously large burned area was 
also associated with elevated fire intensity 
as measured by satellite-derived fire radiative 
power (fig. S1). 
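
The selection rule described above (weeks whose burned area exceeds the climatological mean by more than one standard deviation) can be sketched as follows. This is a simplified illustration under stated assumptions: it pools all weeks into a single mean and SD rather than reproducing the study's climatology, the function name `extreme_week_indices` is hypothetical, and the numbers are synthetic, not MODIS data.

```python
from statistics import mean, stdev


def extreme_week_indices(weekly_burned_area, n_sd=1.0):
    """Return indices of weeks whose burned area exceeds mean + n_sd * SD.

    Assumption: a single pooled mean and sample SD stand in for the
    burned-area climatology used in the study.
    """
    threshold = mean(weekly_burned_area) + n_sd * stdev(weekly_burned_area)
    return [i for i, area in enumerate(weekly_burned_area) if area > threshold]


def share_of_total(weekly_burned_area, indices):
    """Fraction of the total burned area contributed by the selected weeks."""
    selected = sum(weekly_burned_area[i] for i in indices)
    return selected / sum(weekly_burned_area)


if __name__ == "__main__":
    # Synthetic weekly burned areas in arbitrary units (illustrative only).
    area = [1.0, 2.0, 1.5, 0.5, 12.0, 1.0, 2.5, 9.0, 1.0, 0.5]
    weeks = extreme_week_indices(area)
    print("extreme weeks:", weeks)
    print("share of burned area:", round(share_of_total(area, weeks), 2))
```

A small number of weeks can dominate the total in such skewed data, which is the pattern the text reports for the extreme fire weeks.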


Compound drivers of extreme fire years 


We used meteorological data from the fifth 
generation ECMWF atmospheric reanalysis, 
ERA5 (34), to assess surface and synoptic 
weather anomalies during extreme fire weeks. 


1006 2 DECEMBER 2022 + VOL 378 ISSUE 6623 


for false discoveries. 


Extreme fire weeks were anomalously warm 
and dry (Fig. 2, A and B) over northeastern 
Siberia, as compared to the climatology of 1979 
to 2021. These hot and dry conditions are also 
reflected in higher levels of water vapor pres- 
sure deficit and fire weather index (Fig. 2, C 
and D), which are excellent predictors of fire 
activity influencing ignition efficiency and fire 
spread (35). Positive geopotential height anoma- 
lies over Siberia during extreme fire weeks 
(Fig. 2E) suggest that blocking events con- 
nected to large scale atmospheric circulation 
are causing these conditions favorable for fire 
ignition and spread. 

activity were defined as weeks with burned areas exceeding the mean plus 
one standard deviation of the climatology for June, July, and August in 
2001 to 2021 (n = 36). Stippled areas show significant (α = 0.05) differences 
between the composite and the climatology based on t tests corrected 

Composites of the 250-hPa total wind anomaly 
reveal an Arctic front jet during the unusually 
warm and dry extreme fire weeks (Fig. 
2F and fig. S2). An Arctic front jet state represents 
a northward displaced polar jet at 
around 70°N persisting over a wide range of 
longitudes. The Arctic front jet pattern is characterized 
by negative total wind anomalies in 
the midlatitudes and positive anomalies along 
the Arctic coastline, particularly in Siberia. We 
computed the pattern correlation between the 
circumpolar Arctic front jet pattern during 
extreme fire weeks (Fig. 2F) and the weekly 
250-hPa wind anomaly over the same area to 
quantify the presence of an Arctic front in any 
given week (supplementary text). We found 
strong annual correlations between total sum- 
mer fire activity and Arctic front jet pattern 
correlation averaged over all summer weeks 
(ρ = 0.66, P = 0.001). 
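
A pattern correlation of the kind described above compares two spatial fields point by point. The sketch below computes a centered, optionally weighted correlation between a composite field and a single week's field, each flattened to a list of grid-point values. It is an illustration only: the study's exact procedure is given in its supplementary text, the function name `pattern_correlation` is hypothetical, and the weighting option (for example by the cosine of latitude) is an assumption.

```python
import math


def pattern_correlation(field_a, field_b, weights=None):
    """Centered, weighted correlation between two flattened fields.

    weights: optional per-grid-point weights (for example cos(latitude));
    equal weights are assumed when omitted.
    """
    if weights is None:
        weights = [1.0] * len(field_a)
    total = sum(weights)
    mean_a = sum(w * a for w, a in zip(weights, field_a)) / total
    mean_b = sum(w * b for w, b in zip(weights, field_b)) / total
    cov = sum(w * (a - mean_a) * (b - mean_b)
              for w, a, b in zip(weights, field_a, field_b))
    var_a = sum(w * (a - mean_a) ** 2 for w, a in zip(weights, field_a))
    var_b = sum(w * (b - mean_b) ** 2 for w, b in zip(weights, field_b))
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Synthetic 250-hPa wind anomalies at five grid points (illustrative only).
    composite = [-2.0, -1.0, 0.5, 3.0, 4.0]
    week_similar = [-1.5, -1.2, 0.0, 2.0, 5.0]
    week_opposite = [-x for x in composite]
    print(pattern_correlation(composite, week_similar))
    print(pattern_correlation(composite, week_opposite))
```

A week resembling the composite yields a value near +1 and a week with the reversed pattern yields a value near -1, so averaging the weekly values over a summer gives a single index of how often the pattern was present.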

In 2020, snowmelt occurred on average 
4 days (SD = 9 days) earlier than the 2001 to 
2021 climatology in Siberian larch forests and 
tundra, and 8 days (SD = 7 days) earlier in the 
Arctic Circle of northeastern Siberia (fig. $3). 
Snowmelt timing is significantly correlated to 
summer fire activity (p = -0.68, P = 0.001) and 
is a strong predictor of extreme fire weeks in 
Fig. 3. Compound effect of spring snowmelt and Arctic front jet occurrence
on burn anomaly in eastern Siberia between 2001 and 2021. Data
points represent weekly averages in (A) boreal forest and (B) tundra 
ecosystems. Dashed gray lines denote the zero anomalies in snowmelt and the 
zero correlation with the Arctic front jet pattern. (C) Influence of snowmelt (blue) 


Siberia, especially when an Arctic front jet is 
present. While the probability for an anoma- 
lous fire week was 2% in years with late snow- 
melt (negative snowmelt anomaly), it increased 
to 25% in years with early snowmelt (positive 
snowmelt anomaly), and to 44% for weeks 
with a positive Arctic front jet pattern corre- 
lation in addition to an early snowmelt (upper 
left corner in Fig. 3A). The recent extreme 
years 2019 to 2021 were all characterized by 
an earlier-than-usual snowmelt and featured 
several Arctic front jet occurrences in summer 
(Fig. 3A). A linear model to estimate the an- 
nual burn anomaly from snowmelt and the 
Arctic front jet pattern correlation yielded an 
R² of 0.77 (P < 0.001), with 23.3% of the variation
explained by snowmelt timing, 58.3% by
the Arctic front jet pattern, and 18.4% by their 
combined influence. These results were con- 
firmed by a weekly burn anomaly model, 
which was trained out-of-sample to avoid over- 
fitting (supplementary text). The influences 
of both drivers were further consistent across 
different terrain types (fig. S4). The influence 
of early snowmelt is particularly important in 
the far north whereas the Arctic front jet is 
especially important in driving fires south of 
the northern treeline (Fig. 3C). 
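
One common way to split the explained variance of a two-predictor linear model into unique and shared parts is commonality analysis. The sketch below is a minimal illustration of that technique, not the authors' documented procedure; the function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partition_r2(y, x1, x2):
    """Split the R^2 of y ~ x1 + x2 into unique and shared components."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # Closed-form R^2 of an ordinary least-squares fit with two predictors.
    r2_full = (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 ** 2)
    unique_x1 = r2_full - r2 ** 2   # gain from adding x1 to a model with x2
    unique_x2 = r2_full - r1 ** 2   # gain from adding x2 to a model with x1
    shared = r2_full - unique_x1 - unique_x2
    return {"r2_full": r2_full, "unique_x1": unique_x1,
            "unique_x2": unique_x2, "shared": shared}


# Toy data: two uncorrelated predictors and a response built from both.
snowmelt_anomaly = [-3.0, -1.0, 1.0, 3.0]
jet_correlation = [1.0, -1.0, -1.0, 1.0]
burn_anomaly = [2.0 * s + j for s, j in zip(snowmelt_anomaly, jet_correlation)]

parts = partition_r2(burn_anomaly, snowmelt_anomaly, jet_correlation)
for name, value in parts.items():
    print(name, round(value, 4))
```

Dividing each component by `r2_full` expresses it as a share of the explained variation, which is the form in which the percentages are quoted in the text. With correlated predictors the shared term can be negative.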


Northward shift of lightning and fires 


The compound effect of early snowmelt and 
Arctic front jet occurrences also promoted 
excessive burning in northern tundra ecosys- 
tems (Fig. 3B). Of the total burned area above 
the northern treeline in northeastern Siberia 
between 2001 and 2021, 75% occurred between 
2019 and 2021. The probability for anoma- 
lous fire activity above the northern treeline 
was 16% for summer weeks during which an 
Arctic front jet had formed after early snowmelt,
compared with only 1% for summer
weeks without an Arctic front jet and with late
snowmelt (compare upper left corner and lower
right corner in Fig. 3B). Compared with 2019 
(272 km²) and 2020 (694 km²), 2021 featured
less tundra burning (124 km²) as a result of
the more southward position of the Arctic 
front jet and its associated high-pressure re- 
gion (fig. S5). 

Fires in boreal forest and tundra regions are 
usually ignited by lightning strikes or human 
activity. In Yakutia, which comprises 68% of 
the total burned area within our study area, 
recent studies estimated that approximately 
50% of the burned area has an anthropogenic 
cause, 31 to 43% stems from lightning strikes, 
and the remainder results from overwintering 
fires or has unknown cause (36, 37). However, 
in sparsely populated northern tundra ecosys- 
tems, ignitions are rare and lightning is the 
source of most burned areas (10, 11). We analyzed
Global Lightning Detection (GLD360)
lightning data for 2012 to 2021 to assess 
whether early snowmelt and Arctic front jet 
formation increased boreal forest and tundra 
lightning activity. Anomalously high lightning 
activity (exceeding the 2012 to 2021 climatol- 
ogy by more than one SD) was more than 
three times as likely during extreme fire weeks 
(probabilities: 0.38 for boreal forest, 0.33 for 
tundra) compared with other weeks (proba- 
bilities: 0.12 for boreal forest, 0.08 for tundra). 
More than half (54%) of all lightning strikes 
above the northern treeline between 2012 and 
2021 occurred during these extreme weeks (fig. 
S6). The blocked anticyclone, associated with 
an Arctic front jet, creates favorable conditions 
for the buildup of strong convection needed 
for the generation of lightning-rich thunderstorms.
Anomalous lightning activity was associated with

and Arctic front jet pattern (red) on fire activity in distance from treeline
(binned to 100 km, 0 denotes all tundra regions). Darker dots represent
significant correlations (α = 0.05), lighter dots insignificant correlations. Lines
are fitted using loess regression (span = 5) for Arctic front jet occurrence and
linear regression for snowmelt.


Arctic front jet occurrences in
boreal forest (ρ = 0.24, P = 0.005) and tundra
ecosystems (ρ = 0.29, P = 0.001, fig. S6C). Likewise,
spring snowmelt timing was inversely
related to lightning activity in June and July 
in both boreal forest and tundra regions (fig. 
S6C), indicating that an early snowmelt incites 
early summer lightning anomalies. 
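
The "more than three times as likely" statement can be checked from the quoted probabilities. The short sketch below does this and also shows one way to flag anomalous weeks with a mean-plus-one-SD threshold. The use of the sample standard deviation, the helper names, and the toy weekly counts are assumptions for illustration.

```python
import statistics


def flag_anomalous(values):
    """Flag entries exceeding the mean by more than one sample SD."""
    threshold = statistics.mean(values) + statistics.stdev(values)
    return [v > threshold for v in values]


def relative_likelihood(p_extreme_weeks, p_other_weeks):
    """Ratio of two probabilities of anomalously high lightning activity."""
    return p_extreme_weeks / p_other_weeks


# Probabilities quoted in the text for extreme fire weeks and other weeks.
ratios = {
    "boreal forest": relative_likelihood(0.38, 0.12),
    "tundra": relative_likelihood(0.33, 0.08),
}
for biome, ratio in ratios.items():
    print(biome, round(ratio, 2))

# Toy weekly lightning counts: only the last week stands out.
toy_flags = flag_anomalous([1, 1, 1, 1, 10])
print(toy_flags)
```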


Climate warming and extreme fires 


Regional snowmelt in Siberia’s larch forests 
has started 1.7 days earlier each decade since 
1966 (P = 0.001), with even faster acceleration 
near the treeline (Fig. 4, A and B). To quantify 
potential trends in the frequency of the Arctic 
front jet pattern, we performed a sensitivity 
analysis using various thresholds for the pat- 
tern correlation, with the trend of the median 
threshold shown in Fig. 4C. Our analysis re- 
veals an upward trend in the frequency of the 
Arctic front jet pattern throughout summer 
of 0.5 days (SD = 0.2, fig. S7) per decade
since 1979. This translates to an increase in 
the average frequency of Arctic front jet states 
from 1.0 week (SD = 0.3) per year in 1980 to 
3.1 weeks per year (SD = 1.0) in 2020, repre- 
senting a tripling (average ratio: 3.3, SD = 0.6) 
in frequency over the last 40 years (fig. S7B). 
The underlying mechanisms behind these 
changes in Arctic front jets are not yet fully 
clear. They might be attributable to changing 
dynamics driven by a changing pole-to-equator 
temperature gradient but may also result from 
thermodynamical changes such as the thermal 
expansion of the lower troposphere with warm- 
ing or regional feedbacks driving blockings 
(38). Nevertheless, the upward trends as re- 
ported here are consistent with those based 
on more advanced circulation analyses using 
self-organizing maps (29). 
Fig. 4. Trends of the snowmelt timing and frequency of the Arctic front
jet pattern. (A) Annual snowmelt timing averaged over the study area using the 
National Oceanic and Atmospheric Administration climate data record weekly 
snow cover product (NSIDC, filled circles) and the MODIS snow cover product 
(triangles). The linear trend was based on NSIDC data. (B) NSIDC snowmelt 
trend in relation to distance from treeline (binned to 100 km). All individual 


Snowmelt and atmospheric circulation are 
strongly interconnected. Atmospheric circula- 
tion can accelerate snowmelt by governing the 
location of high- and low-pressure regions. For 
example, the early snowmelt in eastern Siberia 
in 2020 has been attributed to a strong strato- 
spheric polar vortex in the preceding winter 
and spring that prevented cold Arctic air from 
penetrating into the midlatitudes (13). In ad- 
dition, we show here that Arctic front jet states 
are associated with warmer and drier air over 
northern land areas, which may contribute to 
accelerating snowmelt in early summer. In re- 
turn, the timing and magnitude of snowmelt 
influences atmospheric circulation through its 
influence on soil moisture and surface albedo. 
A darker surface and drier and warmer soils 
induced by early snowmelt inhibit evapora- 
tive cooling leading to a warmer atmosphere 
and positive geopotential height anomalies 
(8, 19). Furthermore, stark thermal contrasts 
between the fast-warming land surface and 
the cooler Arctic ocean facilitate the forma- 
tion of an Arctic front jet in early summer 
(18, 21, 29, 39). We found that an earlier snow- 
melt was associated with an earlier formation 
of Arctic front jets as indicated by a significant 
negative correlation between snowmelt tim- 
ing and Arctic front jet development in the first 
3 weeks of June (ρ = −0.25, P = 0.02). In our
study region, snowmelt usually started in the 
beginning of May (average day of year when 
10% of areas are snow-free, day 128 or 7/8 May), 
with 50% of areas being snow-free by end of 
May (average day of year 147 or 26/27 May), 
and 90% by mid-June (average day of year 
166 or 14/15 June). Increases in Arctic front 
jet states in June may therefore be partly in- 
terconnected with accelerating snowmelt. By 
contrast, we did not find any statistical relationships
between snowmelt timing and Arctic
front jet occurrence in July and August.
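
The paired calendar dates given for each day of year (for example, "day 128 or 7/8 May") reflect leap and non-leap years. A minimal sketch of the conversion follows; the helper name and the two example years are chosen only for illustration.

```python
from datetime import date, timedelta


def day_of_year_to_date(year, day_of_year):
    """Convert a 1-based day of year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


# 2020 is a leap year and 2021 is not, so the same day of year
# falls one calendar day apart.
for doy in (128, 147, 166):
    leap = day_of_year_to_date(2020, doy)
    common = day_of_year_to_date(2021, doy)
    print(doy, leap.isoformat(), common.isoformat())
```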
We have shown that Arctic front jets to- 
gether with early snowmelt favor extreme wild- 
fire conditions and that both drivers have 
strengthened over the past decades. This strong- 
ly contributes to increasing fire activity over 
northern high latitudes (8, 40). Arctic front jets 
are often accompanied by a sharp subtropical 
jet and such double jet states are associated 
with persistent blocking in the midlatitudes 
(29, 31, 41). They can incite persistent weather 
since they support the formation of wave- 
guides which trap and focus free-traveling 
Rossby waves, enabling high-amplitude, quasi- 
stationary waves, for example through reso- 
nance processes (30). Anticyclonic blocking 
has long been identified as a driver of large 
burned areas, as observed in boreal North 
America (42), since it generates tinder-dry 
fuels and abundant ignitions from thunder- 
storms. Summer blocking events are widespread 
in northern Eurasia (43) but understanding 
of the diverse geneses of blocking events re- 
mains limited, which hampers accurate mod- 
elling of blocking in climate models and 
medium-range forecasts (44-46). Eastern 
Siberia may be particularly susceptible to 
intensifying fire activity because it experiences 
faster and more widespread accelerations of 
snowmelt than other Arctic regions (15). Fur- 
thermore, previous research suggests that 
blocked anticyclones over eastern Siberia might 
be linked to decreases in sea ice and snow cover 
(18), and to the incitement of a wave train 
pattern in June driven by positive snow water 
equivalent and soil moisture anomalies in 
western Siberia (47). In addition, intensifying 
fire seasons can have legacy effects through 
the occurrence of overwintering fires (48). 


trends (dots) are significant (α = 0.05). (C) Frequency of the Arctic front jet
(in weeks), estimated based on the pattern correlation of weekly 250-hPa wind
anomalies with the Arctic front jet pattern (Fig. 2F). An Arctic front jet pattern
was assumed for correlations exceeding 0.18 (see fig. S7 and methods for
sensitivity analysis of the threshold). Shaded areas show the 95% confidence
intervals of the trends.


Overwintering fires are increasing in number 
in eastern Siberia where they accounted for 
7.5% of burned areas in 2020 (37). 

The extension of fire activity into northern 
tundra regions may incite strong carbon cycle 
feedbacks. Carbon emissions from large tun- 
dra fires considerably surpass decades of 
carbon accumulation of pan-Arctic tundra 
ecosystems (49). Furthermore, tundra fires 
may promote thermokarst development (3), 
and catalyze shrub expansion into tundra re- 
gions (50). We have shown that drivers of recent 
tundra fire extremes in northeastern Siberia, 
including snowmelt and atmospheric circula- 
tion, are sensitive to climate warming. This 
may imply that the region will see a rapid in- 
tensification of fire activity in the future. Our 
work therefore calls for integrating changes in 
atmospheric circulation patterns when project- 
ing future fire regimes in the high latitudes 
and moving beyond simple extrapolations of relationships
between regional weather and climate
and fire activity. This is needed to fully
comprehend the accelerating role of fire on 
rapidly changing Arctic-boreal ecosystems and 
their potential carbon losses. 
ULTRACOLD CHEMISTRY


Creation of an ultracold gas of triatomic molecules 
from an atom-diatomic molecule mixture 


Huan Yang†, Jin Cao†, Zhen Su, Jun Rui, Bo Zhao*, Jian-Wei Pan*


In recent years, there has been notable progress in the preparation and control of ultracold gases of 
diatomic molecules. The next experimental challenge is the production of ultracold polyatomic molecular 
gases. Here, we report the creation of an ultracold gas of ²³Na⁴⁰K₂ triatomic molecules from a mixture
of ground-state sodium-23-potassium-40 (²³Na⁴⁰K) molecules and potassium-40 (⁴⁰K) atoms. The
triatomic molecules were created by adiabatic magneto-association through an atom-diatomic molecule
Feshbach resonance. We obtained clear evidence for the creation of triatomic molecules by directly 
detecting them using radio-frequency dissociation. Approximately 4000 triatomic molecules with a 
high peak phase-space density of 0.05 could be created. The ultracold triatomic molecules can serve as
a launchpad to probe the three-body potential energy surface and may be used to prepare quantum
degenerate triatomic molecular gases.


Ultracold molecules offer an ideal platform
to study chemical reactions at the
quantum level, quantum simulation of
many-body problems in condensed-matter
physics, and precision measurement
of fundamental constants (1, 2). In recent
years, there has been notable success in the 
preparation and study of ultracold gases of 
diatomic molecules. These include the produc- 
tion of quantum degenerate diatomic molec- 
ular gases (3-5), the realization of quantum 
dipolar molecular gases with tunable interac- 
tions (6, 7), the observation of atom-diatomic 
molecule Feshbach resonances (8-10), and the 
detection of reaction products and interme- 
diate complexes of ultracold reactions (11). 
After the great success of diatomic molecules,
the next experimental challenge is to
prepare and control ultracold triatomic molec- 
ular gases. Ultracold triatomic molecules will 
open up many research opportunities. For 
example, triatomic molecules provide an ideal 
platform to study the quantum-mechanical 
three-body problem. The collisions involving 
triatomic molecules provide a sensitive probe 
of the four-body, five-body, and six-body po- 
tential energy surfaces, which are extremely 
difficult to calculate with high accuracy for 
heavy molecules. Triatomic molecules have 
more freedom to control, thus offering a pre- 
viously unrealized knob for quantum simula- 
tion. Despite these research opportunities, the 
preparation and control of ultracold triatomic 
molecular gases are extremely difficult. 

Two methods are usually used to prepare 
ultracold molecules. One method is direct 
cooling, such as buffer gas cooling, optoelec- 
trical cooling, or laser cooling (12-16). For 
polyatomic molecules, it has been reported 
that the CaOH molecules were laser cooled 
and trapped in a magneto-optical trap with a 
temperature of ~100 μK and a peak phase-space
density of ~10⁻¹² (17). Another method
is to form molecules from ultracold atomic 
gases. In the past two decades, various ultracold
diatomic molecules have been created
by magneto-association and photoassociation 
(18-28). The advantage of ultracold associa- 
tion is that the molecules can inherit the low 
temperatures and the high density of atomic 
gases, and thus the molecular gases can have 
a high phase-space density. With the success 
of the formation of diatomic molecules, sev- 
eral theoretical groups have started to consider 
the feasibility of the creation of triatomic mol- 
ecules from ultracold atom-diatomic molecule 
mixtures (29-31). However, the complexity 
of triatomic molecules makes a quantitative 
analysis extremely difficult. 

The observations of the Efimov resonances 
between atoms and weakly bound diatomic 
molecules (32), the Feshbach resonances be- 
tween weakly bound diatomic molecules (33), 
and the Feshbach resonances between atoms 
and ground-state diatomic molecules (8-10) 
open up the possibility of the creation of poly- 
atomic molecules because the polyatomic 
bound state coincides with the scattering state 
close to the resonances, and the coupling 
strength between them is resonantly enhanced. 
Recently, the association of Efimov trimers 
has been demonstrated by observing the radio 
frequency (rf)-induced loss of atoms in an 
atom-dimer mixture (34) or by analyzing the 
decay dynamics after converting a strongly 
interacting atomic gas into molecular states 
(35). For the Feshbach resonance between 
atoms and ground-state diatomic molecules, 
the association of triatomic molecules has been 
demonstrated by measuring the loss of di- 
atomic molecules induced by the rf field (36). 
The rf loss spectrum provides indirect evidence 
for the existence of triatomic molecules and 
can be used to measure the binding energy. 
However, the direct detection of triatomic mol- 
ecules and the preparation of an ultracold gas 
of triatomic molecules remain elusive. 

Here, we report the creation of an ultracold 
gas of weakly bound ²³Na⁴⁰K₂ triatomic molecules
from a mixture of ²³Na⁴⁰K ground-state
molecules and ⁴⁰K atoms. The method of the
creation and detection of the triatomic mole- 
cules used in our work is illustrated in Fig. 1. 
The triatomic molecules were created by adia- 
batic magneto-association by means of ramp- 
ing the magnetic field through a Feshbach 
resonance between ²³Na⁴⁰K molecules and
⁴⁰K atoms. We obtained clear evidence for the
creation of triatomic molecules by directly 
detecting them using the rf dissociation. About 
4000 triatomic molecules at a temperature of 
~100 nK could be created, with a peak density 
of ~3 × 10¹¹ cm⁻³. The ultracold triatomic molecular
gas had a peak phase-space density of
~0.05, which is ~10 orders of magnitude larger 
than that of the laser-cooled triatomic mole- 
cules. Our work may largely improve the 
understanding of complicated atom-molecule 
Feshbach resonance, which is difficult to describe
quantitatively because of the high density
of states (9). Our work may also open up
the possibility of the preparation of Bose- 
Einstein condensates of triatomic molecules 
and the production of ground-state ultracold 
triatomic molecules. 
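
The quoted peak phase-space density can be cross-checked against the quoted temperature and peak density. The following is a minimal sketch, assuming the standard definition n·λ³ with the thermal de Broglie wavelength λ = h/√(2π m k_B T), a peak density read as ~3 × 10¹¹ cm⁻³, and a molecular mass approximated by the mass numbers 23 + 40 + 40. The function names are illustrative.

```python
import math

H = 6.62607015e-34        # Planck constant (J s)
K_B = 1.380649e-23        # Boltzmann constant (J/K)
AMU = 1.66053906660e-27   # atomic mass unit (kg)


def thermal_de_broglie_wavelength(mass_kg, temperature_k):
    """Thermal de Broglie wavelength in metres."""
    return H / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)


def peak_phase_space_density(density_per_cm3, mass_amu, temperature_k):
    """Phase-space density n * lambda^3 for a single internal state."""
    wavelength = thermal_de_broglie_wavelength(mass_amu * AMU, temperature_k)
    density_per_m3 = density_per_cm3 * 1.0e6   # cm^-3 to m^-3
    return density_per_m3 * wavelength ** 3


# Values quoted in the text: ~3e11 cm^-3 at ~100 nK; mass from mass numbers.
psd = peak_phase_space_density(3.0e11, 23 + 40 + 40, 100e-9)
print(f"phase-space density ~ {psd:.3f}")
```

Under these assumptions the result is consistent with the quoted value of about 0.05.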

Our experiment started with the prepa- 
ration of a quantum degenerate mixture of 
²³Na⁴⁰K molecules and ⁴⁰K atoms. The experimental
setup has been introduced in previous
works (5). The experimental procedures for 
preparing the ultracold mixture are described 
in the supplementary materials. In brief, we 
first prepared a mixture of a nearly pure Bose-Einstein
condensate of ²³Na atoms and a deeply
degenerate Fermi gas of ⁴⁰K atoms with T/T_F ≈ 0.2
(where T is the temperature of the atoms and
T_F is the Fermi temperature) at a temperature
of ~100 nK. The ground-state ²³Na⁴⁰K molecules
were created by magneto-association followed 
by stimulated Raman adiabatic passage at a 
magnetic field of 77.6 G. The degeneracy of a 
pure ²³Na⁴⁰K molecular gas was T/T_F = 0.4 to
0.5, which was determined by fitting the two-dimensional
(2D) column density distribution
to the Fermi-Dirac distribution. After removing
²³Na atoms, we obtained an ultracold
mixture of ²³Na⁴⁰K molecules and ⁴⁰K atoms.


[Fig. 1 panels: (A) atom-diatomic-molecule Feshbach resonance, energy versus magnetic field; (B) radio-frequency dissociation.]


Fig. 1. Illustration of the magneto-association
and the rf dissociation of triatomic molecules.
(A) In the vicinity of the atom-diatomic molecule
Feshbach resonance, by adiabatically ramping the
magnetic field across the resonance from above
the resonance to the triatomic molecular side,
pairs of ²³Na⁴⁰K molecules and ⁴⁰K atoms can be
coherently converted into ²³Na⁴⁰K₂ triatomic
molecules. (B) The ²³Na⁴⁰K₂ triatomic molecules
lie below the |0, 0, −3/2, −4⟩ + |9/2, −9/2⟩
scattering state with the binding energy E_b. By
applying an rf pulse driving the atomic transition
|9/2, −9/2⟩ → |9/2, −7/2⟩, the ²³Na⁴⁰K₂ triatomic
molecules can be dissociated into free ²³Na⁴⁰K
molecules and ⁴⁰K atoms when the rf is larger than
a threshold equal to the sum of the atomic
transition and the binding energy.


The ²³Na⁴⁰K molecules were prepared in the
maximally polarized state |v, N, m_Na, m_K⟩ =
|0, 0, −3/2, −4⟩, where the first two quantum
numbers are the vibrational and rotational
quantum numbers, respectively, and m_Na and
m_K represent the nuclear spin projections of
Na and K along the magnetic field. The ⁴⁰K
atoms were prepared in the lowest hyperfine
state |f, m_f⟩ = |9/2, −9/2⟩, which is also the
maximally polarized state (where f is the quantum
number of the total angular momentum).
For this hyperfine state combination, there is a 
broad atom-molecule Feshbach resonance at 
~48 G (9, 37). 

We first characterized the Feshbach reso- 
nance by measuring the binding energies of the 
triatomic molecules through the rf loss spec- 
trum. The experimental details are described in 
the supplementary materials (section 2). The 
measured binding energy $E_b$ as a function of
the magnetic field $B$ is shown in Fig. 2. The
data were fitted to the model
$\sqrt{2 m_r E_b}/\hbar = 1/(a_{\mathrm{bg}} - \bar{a}) + \Gamma/\{2\bar{a}[E_b + \delta\mu(B - B_c)]\}$ (38),
where $m_r$ is the reduced mass, $\hbar$ is the reduced Planck
constant, $a_{\mathrm{bg}} = -692a_0$ is the background scattering
length measured in (37), and $\bar{a} = 74a_0$ is
the mean scattering length. The fitting parameter
$\delta\mu$ is the relative magnetic moment, $B_c$
is the magnetic field at which the bare bound
state crosses the scattering threshold, and $\Gamma$
is the Feshbach coupling strength. The fitting
yielded $\delta\mu = 1.5()\mu_B$, $B_c = 44.2()$ G, and
$\Gamma = 1.6(2)$ MHz. We obtained the resonance
position $B_0 = 48.2(5)$ G and the resonance
width $\Delta = 4.5(5)$ G, which agreed well with the
results determined from the measurement
of the elastic scattering cross sections (37).
The strength of the Feshbach resonance is
given by $s_{\mathrm{res}} = a_{\mathrm{bg}}\Delta\delta\mu/(\bar{a}E_{\mathrm{vdW}}) \approx 7$, where $E_{\mathrm{vdW}} =
12.5$ MHz is the van der Waals energy (39). The
strength $s_{\mathrm{res}} > 1$ means that this resonance is
open channel dominated.
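
The binding-energy model and the resonance strength above can be evaluated numerically. The sketch below is illustrative only and rests on several assumptions: the model is read as √(2 m_r E_b)/ħ = 1/(a_bg − ā) + Γ/{2ā[E_b + δμ(B − B_c)]}, Γ and E_vdW are treated as frequencies (energy divided by h), the reduced mass is built from the mass numbers 63 and 40, the resonance strength is computed from magnitudes, and the central fitted values are used without uncertainties. The function names and the bisection bracket are hypothetical choices.

```python
import math

H = 6.62607015e-34         # Planck constant (J s)
HBAR = H / (2.0 * math.pi)
AMU = 1.66053906660e-27    # atomic mass unit (kg)
A0 = 5.29177210903e-11     # Bohr radius (m)
MU_B_OVER_H = 1.399624e6   # Bohr magneton / h (Hz per gauss)

M_R = (63.0 * 40.0) / (63.0 + 40.0) * AMU   # reduced mass of NaK + K
A_BG = -692.0 * A0                          # background scattering length
A_BAR = 74.0 * A0                           # mean scattering length
DELTA_MU = 1.5 * MU_B_OVER_H                # relative magnetic moment (Hz/G)
B_C = 44.2                                  # bare-state crossing field (G)
GAMMA = 1.6e6                               # coupling strength, assumed Hz


def residual(e_b_hz, b_gauss):
    """Left side minus right side of the model, with E_b given as E_b/h."""
    kappa = math.sqrt(2.0 * M_R * H * e_b_hz) / HBAR
    detuning = e_b_hz + DELTA_MU * (b_gauss - B_C)
    return kappa - 1.0 / (A_BG - A_BAR) - GAMMA / (2.0 * A_BAR * detuning)


def binding_energy_hz(b_gauss, low=1.0, high=1.0e7, iterations=100):
    """Solve the implicit model for E_b/h by bisection (valid for B > B_c)."""
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        # The residual grows monotonically with E_b on this bracket.
        if residual(mid, b_gauss) > 0.0:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


def resonance_strength(width_gauss=4.5, e_vdw_hz=12.5e6):
    """Magnitude of s_res = a_bg * Delta * delta_mu / (a_bar * E_vdW)."""
    return abs(A_BG / A_BAR) * abs(width_gauss) * DELTA_MU / e_vdw_hz


s_res = resonance_strength()
e_b_4630 = binding_energy_hz(46.30)
e_b_4586 = binding_energy_hz(45.86)
print(f"s_res ~ {s_res:.1f}")
print(f"E_b/h at 46.30 G ~ {e_b_4630 / 1e3:.0f} kHz")
print(f"E_b/h at 45.86 G ~ {e_b_4586 / 1e3:.0f} kHz")
```

Under these assumptions the resonance strength comes out close to 7, and the solved binding energies at 46.30 and 45.86 G are of the same size as the 95 and 201 kHz quoted later in the text for those fields.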

After characterizing the Feshbach resonance, 
we studied the adiabatic magneto-association 


[Fig. 2 plot: binding energy E_b (kHz) versus magnetic field (G), horizontal axis from 44.5 to 47.0 G.]


Fig. 2. The binding energy of the triatomic mole- 
cules as a function of magnetic fields. The 
binding energy of the triatomic molecules was 
measured using the rf loss spectrum. The experi- 
mental details are described in the supplementary 
materials. The data were fitted to the model in (38) 
(see the text for details). Error bars are smaller than 
the points, so they are not shown. 
of triatomic molecules. To perform magneto-association
in the vicinity of the atom-molecule
Feshbach resonance, we needed to adiabat- 
ically ramp the magnetic field from above the 
resonance to the triatomic molecular side. In 
this process, the atom-diatomic molecule pairs 
could be coherently converted into triatomic 
molecules. Magneto-association has been the 
primary method to form diatomic molecules 
from ultracold atomic gases (39, 40). However, 
it is not clear whether magneto-association 
will work for triatomic molecules because of 
some unknown difficulties. The first one is 
that the conversion efficiency critically de- 
pends on the phase-space density of the atom- 
diatomic molecule mixture. Efficient creation 
of molecules can only be achieved for a high 
initial phase-space density. Previously, we tried 
to use magneto-association to form triatomic 
molecules in an atom-molecule mixture at 
~500 nK with no success (41). The second 
difficulty is that the lifetime of the triatomic
molecules is unknown and is usually believed
to be short. If the magnetic-field sweep
is too slow, the triatomic molecules will be 
lost and cannot be detected. However, if the 
magnetic-field sweep is too fast, the molecules 
cannot be efficiently formed. In the current 
work, these two difficulties were mitigated 
by preparing a quantum degenerate atom- 
molecule mixture and choosing an open 
channel-dominated Feshbach resonance to 
form the triatomic molecules. However, it is 
not clear whether these advances are suffi- 
cient for the magneto-association to work. 
We first searched for the possible formation 
signal by monitoring the loss of ²³Na⁴⁰K molecules
induced by ramping the magnetic field
across the resonance. To this end, after preparing
the mixture of ²³Na⁴⁰K molecules and ⁴⁰K
atoms at 77.6 G, we quickly changed the magnetic
field to 55 G and then ramped the magnetic
field downward from 55 to 45 G at a
speed of ~5 G/ms. After a variable hold time,
we removed the ⁴⁰K atoms using a 200-μs resonant
light pulse. The magnetic field was then
ramped to 77.6 G, and the ²³Na⁴⁰K molecules
were transferred back to the Feshbach state 
for detection. The magnetic field between 55 
and 45 G as a function of time is shown in the 
inset of Fig. 3. The ramp started at 1 ms. After 
ramping the magnetic field across the reso- 
nance, we observed a sharp decrease in the 
number of ²³Na⁴⁰K molecules. About 73% of
the ²³Na⁴⁰K molecules were lost (Fig. 3). For
such a high ramping speed, the large fractional
loss of the ²³Na⁴⁰K molecules could not
be completely caused by inelastic collisions
between the ²³Na⁴⁰K molecules and the ⁴⁰K
atoms. To see this clearly, we ramped the magnetic
field upward at the same speed. We
observed that only ~40% of the ²³Na⁴⁰K molecules
were lost. From the difference in the fractional 
loss between the downward and upward ramps, 
Fig. 3. The loss of ²³Na⁴⁰K molecules induced
by ramping the magnetic field across the
resonance. The magnetic field was swept from
55 to 45 G at a speed of ~5 G/ms. The ramp started
at 1 ms. When the magnetic field crossed the
atom-molecule Feshbach resonance, the number
of ²³Na⁴⁰K molecules decreased abruptly. The solid
line is the fit of the error function to the data. 
Each data point represents the average of five 
measurements, and error bars represent the stan- 
dard error of the mean. The inset shows the 
measured magnetic field as a function of time. 


we inferred that the extra loss for the down- 
ward ramp might be the result of the formation 
of triatomic molecules. However, we could not 
exclude the possibility that some unknown 
mechanism could contribute to these losses. 

Unambiguous evidence for the creation of 
triatomic molecules can be obtained by di- 
rectly probing the triatomic molecules through 
dissociating them into free diatomic mole- 
cules and atoms. The major challenge for dis- 
sociation is the short lifetime of the triatomic 
molecules, which is limited mainly by three 
loss mechanisms—i.e., predissociation, inelas- 
tic collisions with atoms and diatomic mole- 
cules, and photoexcitations by the trap laser 
(42-44). The triatomic molecules may decay 
into free atoms and diatomic molecules if their 
energies are higher than the lowest atom- 
molecule scattering threshold. This predis- 
sociation mechanism was expected to be 
suppressed by preparing the ⁴⁰K atoms in the
lowest hyperfine state |9/2, −9/2⟩. For an open
channel-dominated Feshbach resonance, the 
characteristic size of the weakly bound tri- 
atomic molecule is the scattering length, 
which is much larger than the characteristic 
size of the deeply bound triatomic molecule 
(39). In this case, the loss induced by collisions 
with the fermionic ⁴⁰K atoms or ²³Na⁴⁰K molecules
was expected to be suppressed by the
Pauli blocking principle. This is similar to the
collision between weakly bound ⁸⁷Rb⁴⁰K molecules
and ⁴⁰K atoms, where the Fermi suppression
of collisional loss has been directly
observed (45). The effect of photoexcitations 
by the trap laser was unknown. The obser- 
vation of photoexcitations of the collision 
complex in the ⁸⁷Rb⁴⁰K + ⁸⁷Rb collisions (46)
indicated that the trap laser might also excite 


the triatomic bound state. However, how 
strongly it would affect the lifetime of the 
triatomic bound state was not clear. 

In the experiment, we used an rf field to 
dissociate the triatomic molecules. By apply- 
ing an rf pulse driving the atomic transition 
|9/2, −9/2⟩ → |9/2, −7/2⟩, the ²³Na⁴⁰K₂ triatomic
molecules could be dissociated into
free ²³Na⁴⁰K molecules and ⁴⁰K atoms when
the rf is larger than the sum of the atomic 
transition and the binding energy (Fig. 1B). 
To suppress the possible photoexcitations by 
the trap laser, we switched off the optical di- 
pole trap at 2.4 ms, when the magnetic field 
crossed the resonance. After 0.5 ms, when the 
magnetic field reached 46.30 G, we applied a 
0.2-ms rf pulse to dissociate the triatomic 
molecules. After that, the optical dipole trap 
was switched on to recapture ²³Na⁴⁰K molecules
and ⁴⁰K atoms. The ⁴⁰K atoms were then
removed by a resonant light pulse, and the
number of ²³Na⁴⁰K molecules was detected by
transferring them back to the Feshbach state 
at 77.6 G. During the application of the rf 
dissociation pulse, the magnetic field changed 
from 46.30 to 45.86 G. In this magnetic field 
window, the binding energy increased from 95 
to 201 kHz, and the atomic transition frequen- 
cy decreased from 13.230 to 13.115 MHz. These 
changes partially compensated for each other, 
and thus we used a single-frequency rf pulse to 
dissociate the triatomic molecules in this time 
window. The number of ²³Na⁴⁰K molecules as
a function of the rf is shown in Fig. 4. We observed
a clear asymmetric dissociation spectrum
with a threshold behavior. The ²³Na⁴⁰K
molecules lost in the magnetic-field sweep 
could be recovered by the rf dissociation pulse. 
This dissociation occurred only when the rf 
was larger than a threshold, which in our case 
was equal to the minimum of the sum of the 
frequency of the atomic transition and the 
binding energy in the magnetic-field window. 
The measured threshold frequency was smaller 
than the calculated threshold frequency by 
~20 kHz. We attributed this small difference 
to the systematic errors resulting from the different
fitting models. The number of ²³Na⁴⁰K
molecules dissociated from the triatomic molecules
could be obtained by subtracting the
background signal resulting from the mole- 
cules that did not participate in the magneto- 
association process. A maximum value of ~4000 
could be achieved, which means that at least 
~30% of the ²³Na⁴⁰K molecules could be converted
into triatomic molecules by magneto-association.
The observation of the dissociation
spectrum with a threshold behavior provides 
clear evidence for the creation of ultracold 
triatomic molecules. Assuming that the temper- 
ature of the triatomic molecular gas was the 
same as the atom-diatomic molecule mixture 
and that the polarizability of the triatomic mo- 
lecule was equal to the sum of the polarizability 
Fig. 4. The rf dissociation spectrum of the triatomic molecules. (A) The number of ²³Na⁴⁰K molecules
as a function of rf. The data were fitted to the dissociation spectrum. The red dashed line represents the
frequency of the atomic transition at 46.30 G. The blue dashed line is the fitted threshold dissociation
frequency. When the frequency of the rf field is larger than this threshold, the triatomic molecules can be
dissociated into ²³Na⁴⁰K molecules and ⁴⁰K atoms. For this measurement, the optical dipole trap (ODT) was
switched off at 2.4 ms, and the 0.2-ms rf dissociation pulse was applied at 2.9 ms. Each data point
represents the average of three measurements. (B) The magnetic field as a function of time. The black
dashed line denotes the time at which the optical dipole trap is switched off. The rf pulse was applied in
the time window between the two red dashed lines. (C) The number of ²³Na⁴⁰K molecules as a function of the
time at which the optical dipole trap is switched off. The black points represent the data after applying
the rf dissociation pulse. The red points represent the background data without applying the rf pulse. The
difference between them gave the number of ²³Na⁴⁰K molecules dissociated from the triatomic molecules.
The optical dipole trap started to affect the formation of triatomic molecules at ~2.3 ms, when the magnetic
field reached 48.91 G, slightly above the resonance position. Each data point represents the average of


three measurements. Error bars represent the standa 


of ?2Na*°K molecules and “°K atoms, we ob- 
tained a peak density of ~3 x 10" cm™®. There- 
fore, the triatomic molecular gas had a high-peak 
phase-space density of ~0.05, which is ~10 
orders of magnitude higher than the previous 
results for cold polyatomic molecules (J5, 17). 
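
The relation between the quoted peak density and phase-space density can be sketched numerically. The block below is an illustrative sketch only, not the authors' analysis: it assumes the standard thermal-gas definition PSD = n λ³ with λ = h / sqrt(2π m k_B T), takes the peak density as ~3 × 10¹¹ cm⁻³, and approximates the triatomic mass by the sum of mass numbers (23 + 40 + 40 = 103 u). The function names are hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant (J s)
K_B = 1.380649e-23        # Boltzmann constant (J/K)
AMU = 1.66053906660e-27   # atomic mass unit (kg)


def thermal_wavelength(mass_kg, temperature_k):
    """Thermal de Broglie wavelength, lambda = h / sqrt(2 pi m k_B T)."""
    return H / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)


def phase_space_density(density_m3, mass_kg, temperature_k):
    """Peak phase-space density n * lambda^3 of a thermal gas."""
    return density_m3 * thermal_wavelength(mass_kg, temperature_k) ** 3


def implied_temperature(density_m3, mass_kg, psd):
    """Invert n * lambda^3 = psd for the temperature."""
    lam = (psd / density_m3) ** (1.0 / 3.0)
    return H ** 2 / (2.0 * math.pi * mass_kg * K_B * lam ** 2)


# Assumption: mass approximated by mass numbers, 23 + 40 + 40 = 103 u.
mass_triatomic = 103 * AMU
# Quoted peak density, converted from cm^-3 to m^-3.
n_peak = 3e11 * 1e6
# Temperature implied by the quoted density and phase-space density.
t_implied = implied_temperature(n_peak, mass_triatomic, 0.05)
```

Under these assumptions the two quoted numbers imply a temperature on the order of 100 nK. This is an inference from the definition, not a value stated in the text.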
We found that it was critically important to 
immediately switch off the trap laser when 
the magnetic field crossed the resonance be- 
cause the presence of the trap laser could sub- 
stantially deplete the triatomic molecules. To 
see this effect clearly, we fixed the dissociation 
frequency and changed the time at which the 
optical dipole trap was switched off. The opti- 
cal dipole trap started to affect the formation 
of the triatomic molecules when the magnetic 
field reached ~48.91 G, which is slightly above 
the resonance position (Fig. 4). The dissociation
signals could not be observed when the optical 
dipole trap was present for a few hundred micro- 
seconds, which indicated that the lifetime of 
the triatomic molecules in the optical dipole 
trap was short. This was the direct observation 
that the trap laser could excite the triatomic 
molecules. Assuming that the triatomic mol- 
ecules were created when the magnetic field 
crossed the resonance, the triatomic molecules 
coexisted with the ⁴⁰K atoms and ²³Na⁴⁰K molecules
for a few hundred microseconds before
the optical dipole trap was switched on. During 
this interval, the triatomic molecules may be 
transferred to a more stable state by lasers. 
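
The threshold condition described earlier (the rf must exceed the minimum, over the magnetic-field window, of the atomic transition frequency plus the binding energy) can be illustrated with a short sketch. The field values, frequencies, and binding energies below are hypothetical placeholders chosen only to show the logic; they are not data from this work, and the function name is an assumption.

```python
def threshold_frequency(fields_g, atomic_freq_khz, binding_khz):
    """Return (threshold, field) where atomic frequency + binding energy is smallest.

    All three sequences are sampled at the same magnetic-field points within
    the window swept during the rf pulse. Binding energies are expressed in
    frequency units (E_b / h).
    """
    totals = [fa + eb for fa, eb in zip(atomic_freq_khz, binding_khz)]
    i_min = min(range(len(totals)), key=totals.__getitem__)
    return totals[i_min], fields_g[i_min]


# Hypothetical placeholder samples (not measured values).
fields = [46.2, 46.3, 46.4]          # magnetic field (G)
atomic = [1000.0, 1010.0, 1020.0]    # atomic transition frequency (kHz)
binding = [60.0, 40.0, 45.0]         # binding energy / h (kHz)

thr_khz, field_at_thr = threshold_frequency(fields, atomic, binding)
```

With real inputs, the returned threshold would be the quantity compared against the fitted onset of the dissociation spectrum.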
We have created an ultracold gas of ²³Na⁴⁰K₂
triatomic molecules from a mixture of ²³Na⁴⁰K
molecules and ⁴⁰K atoms by adiabatic magnetoassociation.
The trap laser had strong detrimental
effects on the formation of triatomic
molecules. We will explore whether the photo- 
excitation problem can be solved by changing 
the wavelength of the laser in the future. The 
photoexcitation problem can also be solved by 
transferring the triatomic molecules into a 
magnetic trap because the triatomic mole- 
cules were in a weak field-seeking state. The 
triatomic molecules that could be dissociated 
by the rf field were weakly bound molecules 
near the atom-molecule scattering threshold. 
They could be transferred to closed-channel 
molecules in a high vibrational state by simply 
ramping the magnetic field to zero value, where 
the binding energy was estimated to be on the 
order of 100 MHz (9). 

The creation of ultracold triatomic molec- 
ular gases opens up many research possibi- 
lities. The triatomic molecules are in an 
excited vibrational state. This state can serve 
as a launchpad for the full control of the en- 
ergy and configuration of the triatomic mole- 
cules and the probe of the three-body potential 
energy surface with unprecedented resolution. 
The triatomic molecules may be transferred to 
the rovibrational ground state by lasers, sim- 
ilar to diatomic molecules. The phase-space 
density of the triatomic molecular gas is great- 
ly improved from that in previous works, and 
further improvement may enable the creation 
of quantum degenerate gases or Bose-Einstein 
condensates of triatomic molecules. 


REFERENCES AND NOTES 


1. L. D. Carr, D. DeMille, R. V. Krems, J. Ye, New J. Phys. 11,
055049 (2009).

2. G. Quéméner, P. S. Julienne, Chem. Rev. 112, 4949-5011 
(2012). 

3. L. De Marco et al., Science 363, 853-856 (2019). 

4. A. Schindewolf et al., Nature 607, 677-681 (2022). 

5. J. Cao et al., Preparation of a quantum degenerate mixture of
²³Na⁴⁰K molecules and ⁴⁰K atoms. arXiv:2208.09620 [physics.atom-ph]
(2022).

6. G. Valtolina et al., Nature 588, 239-243 (2020). 

7. K. Matsuda et al., Science 370, 1324-1327 (2020). 

8. H. Yang et al., Science 363, 261-264 (2019). 

9. X.-Y. Wang et al., New J. Phys. 23, 115010 (2021). 

10. H. Son et al., Science 375, 1006-1010 (2022).

11. M.-G. Hu et al., Science 366, 1111-1115 (2019).

12. E. S. Shuman, J. F. Barry, D. DeMille, Nature 467, 820-823
(2010).

13. L. Caldwell et al., Phys. Rev. Lett. 123, 033202 (2019).

14. S. Ding, Y. Wu, I. A. Finneran, J. J. Burau, J. Ye, Phys. Rev. X 10,
021049 (2020).

15. M. Zeppenfeld et al., Nature 491, 570-573 (2012).

16. D. Mitra et al., Science 369, 1366-1369 (2020).

17. N. B. Vilas et al., Nature 606, 70-74 (2022).

18. C. A. Regal, C. Ticknor, J. L. Bohn, D. S. Jin, Nature 424, 47-50
(2003).

19. J. Herbig et al., Science 301, 1510-1513 (2003).

20. K.-K. Ni et al., Science 322, 231-235 (2008). 

21. P. K. Molony et al., Phys. Rev. Lett. 113, 255301 (2014). 

22. T. Takekoshi et al., Phys. Rev. Lett. 113, 205301 (2014). 

23. J. W. Park, S. A. Will, M. W. Zwierlein, Phys. Rev. Lett. 114, 
205302 (2015). 

24. M. Guo et al., Phys. Rev. Lett. 116, 205303 (2016). 

25. T. M. Rvachov et al., Phys. Rev. Lett. 119, 143001 
(2017). 

26. F. Seeßelberg et al., Phys. Rev. A 97, 013405 (2018).

27. K. K. Voges et al., Phys. Rev. Lett. 125, 083401 (2020). 



28. I. Stevenson et al., Ultracold gas of dipolar NaCs ground state
molecules. arXiv:2206.00652 [cond-mat.quant-gas] (2022).

29. J. Pérez-Rios, M. Lepers, O. Dulieu, Phys. Rev. Lett. 115, 073201 
(2015). 

30. J. Schnabel, T. Kampschulte, S. Rupp, J. Hecker Denschlag,
A. Köhn, Phys. Rev. A 103, 022820 (2021).

31. R. Hermsmeier, J. Kłos, S. Kotochigova, T. V. Tscherbul,
Phys. Rev. Lett. 127, 103402 (2021).

32. T. Kraemer et al., Nature 440, 315-318 (2006). 

33. C. Chin et al., Phys. Rev. Lett. 94, 123201 (2005). 

34. T. Lompe et al., Science 330, 940-944 (2010). 

35. C. E. Klauss et al., Phys. Rev. Lett. 119, 143401 (2017). 

36. H. Yang et al., Nature 602, 229-233 (2022). 

37. Z. Su et al., Phys. Rev. Lett. 129, 033401 (2022). 

38. A. D. Lange et al., Phys. Rev. A 79, 013622 (2009). 

39. C. Chin, R. Grimm, P. Julienne, E. Tiesinga, Rev. Mod. Phys. 82,
1225-1286 (2010).

40. T. Köhler, K. Góral, P. S. Julienne, Rev. Mod. Phys. 78,
1311-1361 (2006).

41. X.-Y. Wang, “Association of triatomic molecules by ultracold
⁴⁰K-²³Na⁴⁰K atom-molecule Feshbach resonances,” thesis,
Institute of Chemistry, Chinese Academy of Sciences (2022).


42. A. Christianen, M. W. Zwierlein, G. C. Groenenboom, T. Karman, 
Phys. Rev. Lett. 123, 123402 (2019). 

43. P. D. Gregory, J. A. Blackmore, S. L. Bromley, S. L. Cornish, 
Phys. Rev. Lett. 124, 163402 (2020). 

44. Y. Liu et al., Nat. Phys. 16, 1132-1136 (2020).

45. J. J. Zirbel et al., Phys. Rev. Lett. 100, 143201 (2008). 

46. M. A. Nichols et al., Phys. Rev. X 12, 011049 (2022). 

47. H. Yang et al., Data of creation of an ultracold gas of triatomic
molecules, dataset, Zenodo (2022); https://doi.org/10.5281/zenodo.7247988.


ACKNOWLEDGMENTS 


Funding: This work was supported by the National Key R&D 
Program of China (under grant no. 2018YFA0306502), the 
National Natural Science Foundation of China (under grant nos. 
11904355 and 12274393), the Chinese Academy of Sciences, 
the Anhui Initiative in Quantum Information Technologies, the 
Shanghai Municipal Science and Technology Major Project (grant 
no. 2019SHZDZX01), the Shanghai Rising-Star Program (grant 
no. 20QA1410000), and the Innovation Program for Quantum 
Science and Technology (grant no. 2021ZD0302101). Author 
contributions: B.Z. and J.-W.P. conceived the experiment. H.Y.,
J.C., Z.S., and J.R. carried out the experiment. All authors
contributed to the analysis of the data and to the writing of the 
manuscript. B.Z. and J.-W.P. supervised the project. Competing 
interests: The authors declare no competing interests. Data 
and materials availability: All data needed to evaluate the 
conclusions in the paper are present in the paper or the 
supplementary materials. All data presented in this paper are 
deposited at Zenodo (47). License information: Copyright © 2022 
the authors, some rights reserved; exclusive licensee American 
Association for the Advancement of Science. No claim to original 
US government works.
https://www.science.org/about/science-licenses-journal-article-reuse


SUPPLEMENTARY MATERIALS 


science.org/doi/10.1126/science.ade6307 
Materials and Methods 

Fig. S1

References (48-50) 


Submitted 29 August 2022; accepted 2 November 2022 
10.1126/science.ade6307 


2 DECEMBER 2022 • VOL 378 ISSUE 6623 1013


LIFE SCIENCE TECHNOLOGIES 


new products: dna/rna analysis 


Nucleic Acid Extraction Kits for
Pathogen Detection

AMS Biotechnology has added several new
products to its growing range of magnetic
bead-based nucleic acid extraction kits
for pathogen detection. The MagSi-NA
Pathogens MSP Kit overcomes false negative results by using a
nondilutive technique to sequentially capture all viral RNA present 
in any of up to six samples, all in a single magnetic bead pellet. 
The automation-ready rQ MagSi-NA Pathogens Kit, consisting of 
conveniently prefilled deep-well plates, will also help labs increase 
throughput and allow for the preparation of up to 96 samples in 
less than 20 min. In addition, the new MagSi-DX Pathogens Kit has 
been specifically validated for SARS-CoV-2 diagnostic workflow and 
is ideal for labs requiring CE-in vitro diagnostic (IVD) products for 
use in human diagnostics. CE-IVD marked versions of MagSi-NA
Pathogens MSP and rQ MagSi-NA Pathogens are scheduled for 
release in the coming months. 

AMS Biotechnology 

For info: +44-(0)-1235-828200 
www.amsbio.com/magsi-na-pathogens-kit 


Cas13a Nuclease (Lyophilized) 

Cas13a is a class II and type VI CRISPR system effector protein with
two higher eukaryotes and prokaryotes nucleotide-binding (HEPN)
domains. It is a novel CRISPR nuclease that can be used for targeted
RNA cleavage. While Cas13a recognizes and cleaves target RNA under 
the guidance of guide RNA, its collateral cleavage activity is activated, 
which can efficiently cleave nonspecific single-stranded RNA (ssRNA). 
The lyophilized version of Cas13a can be transported at room 
temperature, saving the high cost of dry ice transportation. 

Beijing SBS Genetech 

For info: +86-(0)-10-62969345 
www.sbsgenetech.com/store/products/cas13a-nuclease-lyophilized 


All-In-One Gene Expression System 

Inducible expression vectors are an essential tool in life science 
research for controlled gene expression. However, transfecting two 
plasmids and background expression/leakiness in the off state are 
the primary concerns for such a system. OriGene has created a new 
and improved All-in-One Tet-On system, an upgraded version of the
original system designed to stimulate the expression of the gene of
interest. It has a Tet-On 3G transactivator and a tightly
regulated tetracycline-responsive element (TRE) promoter in one
vector, making it an All-in-One system. Its key features include: (1)
the presence of the transactivator and the promoter in the same 
vector, eliminating the need for another plasmid; (2) a modified 3G 
transactivator for high doxycycline sensitivity; and (3) a high level of 
induction with low leakiness. The All-in-One Tet-On system makes 
your experiments convenient and time-efficient. 

OriGene 

For info: 1-888-267-4436 

www.origene.com 


Produced by the Science/AAAS Custom Publishing Office 


In Situ Hybridization Probes 

BioGenex offers a comprehensive range of in situ hybridization 
(ISH) probes. In situ hybridization technique is used for precise 
detection and localization of a specific nucleic acid sequence within 
tissues and cells in molecular diagnostics of genetic variation. The 
underlying principle of ISH is that the nucleic acids within tissue 
and cell specimens can be detected by the hybridization of a 
complementary nucleic acid probe to which a reporter molecule is 
attached. Common nonradioactive labels for probes are fluorescein 
and digoxigenin. These probes allow quantitative detection of
specific DNA/RNA sequences in their native form within the cells
of formalin-fixed paraffin-embedded (FFPE) tissue sections. These
probes offer reliable, highly sensitive, and easy-to-perform DNA and 
RNA ISH assays when used with BioGenex ISH Detection Systems. 
BioGenex 

For info: 1-800-421-4149 

biogenex.com/ish-probes 


Real-Time PCR System 

Bio-Rad Laboratories announces the launch of the CFX Duet Real- 
Time PCR System to support researchers in developing singleplex 
and duplex quantitative PCR (qPCR) assays. The CFX Duet System 
offers the robust thermal performance and proprietary, accurate 
optical shuttle system of Bio-Rad’s CFX Opus System, with thermal 
gradient functionality to enable optimization in fewer runs. The CFX 
Duet is a two-color system that is factory calibrated for common 
dyes and, without the need for passive reference dyes, allows
the precise quantification of up to two targets in genotyping and
multiple gene expression analyses. An additional fluorescence 
resonance energy transfer (FRET) mode supports protein melt 
analysis for basic protein characterization. The system utilizes CFX 
Maestro Software to provide easy experimental setup and analysis 
and to deliver customized reports without exporting data to other 
programs. 

Bio-Rad Laboratories 

For info: 1-800-424-6723 
bio.rad.com/en-uk/product/cfx-duet-real-time-pcr-system 


Improved Viral RNA Extraction 

The Chromatrap Homogenizer Spin Column from Porvair Sciences 
maximizes the yield and quality of viral RNA extracted from nasal 
and throat swab samples. The easy-to-use spin column offers a 
fast, efficient one-stop alternative to using traditional syringe and 
needle homogenization techniques. Its novel dual-frit extraction 
design reduces lysate viscosity and captures insoluble debris by 
centrifugation; the homogenized lysate sample is then ready for 
RNA extraction. Employed as a single-use consumable, the column 
eliminates the possibility of sample cross-contamination. It is fully 
compatible with all manual RNA extraction kits and provides an ideal 
sample preparation solution for RNA miniprep and midiprep. 
Porvair Sciences 

For info: +44-(0)-1978-666222 
www.chromatrap.com/homogeniser-spin-column 
A career plan customized for you, by you.

There's only one Science
online @sciencecareers.org

Features in myIDP include:

* Exercises to help you examine your skills, interests, and values.
* A list of 20 scientific career paths

Visit the website and start

As part of a four-faculty line expansion, the Department of Biology at Saint Louis
University (https://www.slu.edu/arts-and-sciences/biology/index.php) invites applications
for four tenure-track Assistant/Associate Professor positions in the areas of
Microbiology, Biochemistry, Neuroscience and Developmental Biology.

Microbiologist: research should address fundamental questions in areas including, but not limited to,
bacteriology, virology, mycology, microbial diversity and evolution, microbial ecology, disease ecology,
host-pathogen interactions, emerging infectious disease, or antimicrobial resistance.

Biochemist: research should address fundamental questions in areas including, but not limited to,
biochemistry, structural biology, enzymology, metabolism, biophysics, and/or biochemical pharmacology.

Developmental Biologist: research should address fundamental questions in areas including, but not limited
to, stem cell biology, cell fate specification, aging, growth and regeneration, developmental genomics,
evolution and development, and models of developmental disorders.

Neuroscientist: research should address fundamental questions in areas including, but not limited to,
developmental neuroscience, neurophysiology, neuroendocrinology, neurochemistry, neuroinflammation, and/
or neurodegeneration.

Successful candidates will be expected to: (a) establish an externally funded research program, and mentor both
undergraduate and graduate (M.S. and Ph.D.) students; (b) support the service charge of the university; and, (c)
contribute to teaching in general biology courses and courses related to their discipline.

Excellent facilities and a competitive start-up package are provided. Many opportunities for collaboration
are available on campus, with researchers at SLU School of Medicine, and at world class institutions
based in Saint Louis including the Missouri Botanical Gardens, Donald Danforth Plant Science Center, Saint

Candidates must have a doctoral degree and postdoctoral experience. Demonstrated experience in diversity,
equity, and inclusion is desired.

Please go to this link and search for “biology” https://slu.wd5.myworkdayjobs.com/Careers

When prompted to upload your Resume/CV on the “My Experience” page, we ask that applicants submit the 
following documents: 

* Cover letter (should list at least three references with contact information) 

* Updated curriculum vitae 

* Statement of research accomplishments and future research plans 

* Statement of teaching and mentoring philosophy 

* Diversity, equity, and inclusion statement
regard to race, color, religion, sex, age, national origin, disability, marital status, sexual orientation,
military/veteran status, gender identity, or other non-merit factors. We welcome and encourage applications 
from minorities, women, protected veterans, and individuals with disabilities (including disabled veterans). 
If accommodations are needed for completing the application and/or with the interviewing process, 
please contact Human Resources at 314-977-5847. 
UCLA Ecology & Evolutionary Biology

TENURE TRACK ASSISTANT PROFESSOR POSITION
IN ECOLOGY AND EVOLUTIONARY BIOLOGY

The Department of Ecology and Evolutionary Biology (EEB) at UCLA is
searching for a tenure track Assistant Professor that complements their
research program with an outstanding commitment to equity and inclusion 
through research, mentorship, teaching, community engagement and/or service. 
Candidates in any area of ecology or evolutionary biology, broadly defined, 
will be considered. We particularly encourage applications from women, 
underrepresented minorities, and individuals with a commitment to mentoring 
underrepresented groups in the sciences. Necessary qualifications include a 
PhD degree in a relevant discipline and a strong background in quantitative 
methods. Evidence of sustained research productivity through journal 
publications is required. The successful candidate will be expected to develop 
an externally funded research program, and teach and mentor at both the 
undergraduate and graduate levels employing inclusive pedagogical 
approaches. We are especially interested in candidates that can complement 
existing departmental strengths in inclusive education, and develop courses 
that satisfy the UCLA undergraduate diversity requirement and/or courses that 
include topics of diversity, equity and inclusion and how they relate to the fields 
of ecology and evolutionary biology. 


Application packages should be submitted online through
https://recruit.apo.ucla.edu/JPF07968 and include the following individual documents:
1) curriculum vitae; 2) research statement including future directions (2 pages
maximum); 3) teaching statements that include teaching interests as well as 
experience employing pedagogies that promote active learning and inclusive 
teaching practices (1 page maximum); 4) statement of contributions to equity, 
diversity, and inclusion that includes previous and planned efforts that advance 
EDI through formal and/or informal mentoring, research or education activities 
(no page limit); and 5) cover letter that includes names of three referees who 
can be contacted for letters (reference letters will only be requested of 
select candidates; 1 page maximum). Review of applications will begin on 
December 10, 2022 and continue until the position is filled. Inquiries about 
the position should be sent to search committee chair Professor Karen Sears 
(ksears@ucla.edu). 


The University of California is an Equal Opportunity/Affirmative Action 
Employer. See full recruitment ad on https://recruit.apo.ucla.edu/JPF07968. 


Shenzhen Institute of Advanced Technology
Chinese Academy of Sciences

Established in partnership between the Chinese Academy
of Sciences and the Shenzhen Municipal Government,
the Shenzhen Institute of Advanced Technology
(SIAT) is a newly-created university with an objective to 
become the world's preeminent institute for emerging 
science and engineering programs. SIAT is equipped 
with state-of-art teaching and research facilities and is 
dedicated to cultivating international, visionary, and in- 
terdisciplinary talents while delivering research support 
to pursue innovation-driven development. 


SIAT is located in Shenzhen, also known as the "Silicon 
Valley of China,” a modern, clean, and green city, 
well-known for its stunning architecture, vibrant econo- 
my, and its status as a leading global technology hub. 
SIAT is seeking applications for faculty positions of all 
ranks in the following academic programs: Computer 
Science and Engineering, Bioinformatics, Robotics, 
Life Sciences, Material Science and Engineering, Bio- 
medical Engineering, Pharmaceutical Sciences, Syn- 
thetic Biology, Neurosciences, etc. SIAT seeks individ- 
uals with a strong record of scholarship who possess 
the ability to develop and lead high-quality teaching 
and research programs. SIAT offers a comprehensive 
benefits package and is committed to faculty success 
throughout the academic career trajectory, providing 
support for ambitious and world-class research proj- 
ects and innovative, interactive teaching methods. 


Further information: 


Who's the top 
employer for 2022? 


Science Careers’ annual survey reveals the top companies
in biotech & pharma voted on by Science readers.

Read the article at
sciencecareers.org/topemployers


CLEMSON

ENDOWED CHAIR IN
MEDICAL BIOPHYSICS

CLEMSON UNIVERSITY
CLEMSON, SOUTH CAROLINA, USA

Clemson University invites leading scholars to apply to become
the founding holder of the Dr. Waenard L. Miller, Jr. ’69 and
Sheila M. Miller Endowed Chair in Medical Biophysics.

The successful candidate will receive a salary commensurate with
experience, comprehensive resources and benefits, and faculty
appointments to build innovative programs that elevate Clemson
University’s prominence in medical biophysics. Candidates will
possess a vision to tackle major challenges in medical biophysics
that will impact our fundamental knowledge of human health.

Scan the QR code or
visit apply.interfolio.com/97767

* Register for a free online account on ScienceCareers.org.
* Search hundreds of job postings and find your perfect job.
* Watch one of our many webinars on different career topics
  such as job searching, networking, and more.
* Sign up to receive e-mail alerts about job postings that
  match your criteria.
* Upload your resume into our database and connect
  with employers.
SCIENCECAREERS.ORG

CHANGE YOUR JOB AND YOU JUST MIGHT CHANGE THE WORLD.

There's no better or more trusted
authority. Get the scoop, stay in the loop with Science Careers.
WORKING LIFE

By Alexandra Ridgway

Writing for the moment

I desperately needed to write. Eighty percent of the way through my Ph.D., with revisions to make and
two more chapters to complete, I felt mounting pressure to get words on paper. When I was a child, 
the words would have come easily. I wrote inspired and quickly, churning out poem after poem for 
our local newspaper. But the rigid essay structures of high school and university extinguished my 
love of writing. I would sit at my computer late into the night, forcing the words out at turtle speed. 
Writing remained slow and laborious during graduate school—until I became a mum to twins. 


Before my twins were born, I had 
envisioned tinkering away at my 
thesis while the babies slept bliss- 
fully in their bassinets. My univer- 
sity offered maternity leave, but 
I had chosen not to take it. I was 
nearly done with my doctorate 
and did not want to wait. I figured 
things would only get trickier as 
the twins got older. 

But from the day my twins were 
born, I felt I had no time to do 
anything besides care for them, 
let alone write. My vision of bas- 
sinet naps was an illusion; the 
babies slept on my chest, in the 
pram, or not at all. One morning, 
I timed how long my son would 
sleep alone. The result: 6 minutes. 
How on earth could I complete 
my Ph.D. in such small snippets 
of time? With every day that went 
by without progress, I increas- 
ingly felt destined to become ABD: all-but-dissertation. 

After my fingers came nowhere close to a keyboard for 
6 weeks, I decided enough was enough; I would just have 
to use the rare minutes I had. I set up my work space, 
with my thesis cued up and ready to go. As soon as my ba- 
bies drifted off, I flew to my computer, a sentence already 
formed in my mind, determined to see whether I could 
land those words on the page before the babies realized 
I was gone. 

I managed to write a full sentence and a half before I 
heard whimpers from the other room. Part of me wished 
I had been able to stay at the computer longer, but I was 
nonetheless exhilarated by my incremental progress. 
Maybe this really could be the way forward. 

In the weeks that followed, I continued what I dubbed 
“microwriting.” Sometimes a baby would wake even before 
I made it to the computer. Those days filled me with fears 
that I would never finish my thesis and that I was fooling 
myself to think I could both care for my children and pursue
my studies. But those moments
also hardened my resolve; I had no 
choice but to try again tomorrow. 
Over time, a sentence gradually 
turned into a paragraph and then a 
page. Even when I only got one word 
down, it was still one step forward. 
Although it was frustrating
to never know whether I would 
make progress on any given day, 
I learned to live for the moment. 
Never had I been more present— 
with my writing, but also as a 
mother. Whichever role I was in, 
I was immersed in it completely. 
And as I learned to savor my lim- 
ited slivers of writing time, my 
childhood love for it returned. 
Over time, those slivers grew. 
My husband was able to reduce his 
working hours and kind friends 
walked the babies in the pram 
while I furiously typed away. When 
6 minutes became 60, I was astonished at what I could do. 
And when I found myself falling back into my old plodding 
ways, I put the timer on for 6 minutes as a reminder of how 
valuable this time was. Bit by bit, I typed my way to submit- 
ting my thesis when the babies were about 3 months old. 
By no means am I recommending that others forgo pa- 
rental leave or try to race to the finish line as I did. But 
microwriting, which began as a necessity, has become core 
to my writing practice. The twins are now 3 years old and 
although I have more opportunities for extended focus, in- 
terruptions are still commonplace. But I’ve learned there 
is never a perfect time to write. I have to carve out mo- 
ments when I can, no matter how small they may be, and 
cherish each opportunity to get words on paper. The time 
to write is now. 


Alexandra Ridgway is a research assistant at the Royal Melbourne 
Institute of Technology University and a fellow at the University 
of Hong Kong. Send your career story to SciCareerEditor@aaas.org. 
ILLUSTRATION: ROBERT NEUBECKER


AAAS MARTIN AND ROSE WACHTEL
CANCER RESEARCH AWARD

Recognizing the work of an early career scientist who has
performed outstanding research in the field of cancer. Award
nominees must have received their Ph.D. or M.D. within the last
10 years. The winner will deliver a public lecture on their research,
receive a cash award of $25,000, and publish a Focus article on
their award-winning research in Science Translational Medicine.

For more information visit
www.aaas.org/aboutaaas/awards/wachtel
or e-mail wachtelprize@aaas.org.
Deadline for submission: February 1, 2023.

AAAS | Science Translational Medicine


It’s time to think differently.

Find out how NEB can support your infectious disease research and development.

Gaining a better understanding of infectious diseases,
including their characterization, evolution and transmission,
continues to be a priority, both from an R&D standpoint
and as a public health issue. The COVID-19 pandemic has
demonstrated the need for a wide range of tools to research
infectious diseases, and has highlighted the importance
of speed and the ability to pivot as new problems arise.
This has emphasized the need for innovation and thinking
differently about where to access those critical materials,
including genomics reagents.

Many scientists know NEB as a trusted reagent supplier to
the life science community, but what you may not know is
that we also offer a portfolio of products that can be used
in infectious disease research, development of diagnostics
and therapies, and in epidemiological studies and disease
surveillance. In fact, many of our products have supported
the development of COVID-19 diagnostics and vaccines,
and can also be utilized with other infectious diseases, such
as influenza and malaria.

Benefit from almost 50 years of experience in
molecular biology & enzymology

Partner with our OEM & Customized Solutions
team to find the best solution for your needs

Take advantage of our expanded manufacturing
capabilities

Access product formats, such as GMP-grade*,
lyophilized, lyo-ready and glycerol-free

Be confident in your product performance with
our expanded quality and regulatory systems

Ready to get started? Learn more at
www.neb.com/InfectiousDiseases

NEW ENGLAND BioLabs

be INSPIRED
drive DISCOVERY
stay GENUINE

* “GMP-grade” is a branding term NEB uses to describe reagents manufactured at our Rowley, MA facility, where we utilize procedures and process controls to
manufacture reagents in compliance with ISO 9001 and ISO 13485 quality management system standards. NEB does not manufacture or sell products known
as Active Pharmaceutical Ingredients (APIs), nor do we manufacture products in compliance with all of the Current Good Manufacturing Practice regulations.

One or more of these products are covered by patents, trademarks and/or copyrights owned or controlled by New England Biolabs, Inc. For more information,
please email us at busdev@neb.com. The use of these products may require you to obtain additional third party intellectual property rights for certain applications.

© Copyright 2022, New England Biolabs, Inc.; all rights reserved.

